Transcriptional Activation by an URE4-like Sequence in the EhPgp1 Gene Core Promoter


Introduction
Entamoeba histolytica is the protozoan responsible for human amoebiasis; it kills 70,000 people each year around the world and is considered fourth in mortality after malaria, Chagas disease and leishmaniasis [1]. The parasite presents the multidrug resistance (MDR) phenotype, due to the expression of a surface P-glycoprotein (Pgp) that transports the drug out of the cell and thereby prevents its therapeutic effect. In amoeba, four genes code for Pgp proteins: EhPgp1, EhPgp2, EhPgp5, and EhPgp6. The EhPgp1 and EhPgp6 genes are constitutively expressed in drug-resistant mutants (clone C2). The EhPgp5 gene is induced by the presence of the drug, whereas the EhPgp2 gene transcript is not detected [2,3]. The differential expression of the EhPgp genes suggests a specific control mechanism of the MDR phenotype in this parasite.
Cloning and transcriptional characterization of the EhPgp1 and EhPgp5 gene promoters from drug-sensitive and drug-resistant trophozoites showed that they were 99.7% identical; however, different DNA-protein complexes were formed when nuclear extracts from sensitive and resistant clones were used. These results suggest that specific transcriptional regulators may be involved in the expression of the EhPgp genes in drug-resistant cells [4,5]. Until now, only some cis-regulatory elements [6-11] and very few transcription factors involved in gene expression of this parasite have been identified and characterized [9,12-16].
Analysis of the core promoter of 37 protein-encoding genes of E. histolytica revealed three conserved regions: i) the putative TATA element located at approximately -30 (GTATTTAAA(G/C)), ii) the GAAC sequence, located between the TATA box and the Inr sequence, and iii) the putative Inr region overlapping the transcription initiation site (AAAAATTCA) [7]. Five major upstream regulatory elements (UREs) are present in the hgl5 gene promoter. Four of them (URE1, URE2, URE4 and URE5) act as positive regulatory elements, whereas the URE3 motif has negative regulatory activity [7]. However, URE3 functions as a positive regulatory element in the ferredoxin (fdx1) promoter region [17]. Additionally, a URE1-like sequence was reported as a cis-acting element in the EhRabB gene promoter [18], and the protein that specifically binds to the URE1 sequence (EhURE1BP) was recently identified; it contains five SNase domains and one Tudor motif [19].
Gilchrist et al. [14] identified the protein that binds to the URE3 element (URE3-BP) and recently demonstrated that several genes of E. histolytica are regulated by URE3-BP. The URE3 motif was found in 54% and 39% of promoter regions of the genes modulated by URE3-BP in vitro and in vivo, respectively [20,21]. On the other hand, Schaenman et al. [13] reported that the URE4 sequence, composed of two 9 bp repeats, functions as an enhancer in the hgl5 gene and it interacts with two URE4 enhancer-binding proteins of 18 and 28 kDa called EhEBP1 and EhEBP2 [22].
In the EhPgp1 gene promoter, we identified C/EBP, HOX, GATA-1 and OCT regulatory elements by homology to the consensus sequences reported so far. Some specific oligonucleotides for these elements were able to compete against the promoter DNA in electrophoretic mobility shift assays [4]. Specific deletions of the C/EBP elements demonstrated that two CCAAT/enhancer binding protein sites (-54 to -43 bp and -198 to -186 bp) were cis-acting elements of EhPgp1 gene expression in both drug-sensitive and drug-resistant trophozoites [9]. In addition, two nuclear proteins of 25 and 65 kDa that bound specifically to the C/EBP probe share epitopes with the human C/EBP transcription factor. However, functional assays of the EhPgp1 promoter demonstrated that sequences within -259 to -206 bp other than C/EBP are crucial for promoter activity [9]. Previously, we demonstrated that putative cis-activator sequences for the GATA-1, GAL4, NIT-2 and C/EBP transcription factors are present in the -234 to -197 bp region [23]. Here, we report the presence of a cis-acting element located at -226 to -218 bp of the EhPgp1 gene promoter and the putative transcription factor with which it interacts.

Search for putative consensus elements by in silico analysis
To find consensus sequences for transcription factors in the -234 to -197 bp EhPgp1 promoter region, we used the transcription factor database TFSEARCH version 1.3 (http://www.cbrc.jp/research/db/TF-SEARCH.htlm).
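Database searches of this kind typically score candidate sites against position weight matrices. As a much simpler illustration of consensus matching, the sketch below scans a sequence for exact matches to a consensus written with bracketed alternatives, using the TATA-like consensus GTATTTAAA(G/C) quoted in the Introduction. The function name `find_consensus` and the test sequence are hypothetical, and the code does not reproduce the scoring used by the database.

```python
import re


def find_consensus(sequence, consensus):
    """Return (start, matched_text) for every exact match of a consensus.

    `consensus` uses plain bases plus (X/Y) alternatives, e.g. GTATTTAAA(G/C).
    Starts are 0-based offsets into `sequence`; overlapping sites are reported.
    """
    # Turn each "(G/C)" group into the regex character class "[GC]".
    pattern = re.sub(
        r"\(([ACGT/]+)\)",
        lambda m: "[" + m.group(1).replace("/", "") + "]",
        consensus.upper(),
    )
    # Wrapping the pattern in a lookahead allows overlapping matches.
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]


# Hypothetical sequence, invented for illustration only.
toy_sequence = "CCGTATTTAAAGTTGTATTTAAACAA"
print(find_consensus(toy_sequence, "GTATTTAAA(G/C)"))
# [(2, 'GTATTTAAAG'), (14, 'GTATTTAAAC')]
```

Exact matching of this kind misses degenerate sites that a matrix-based search would still score, so it is useful only as a first pass.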

Transfection and promoter activity
Transient transfection experiments were performed by electroporation [25], using 10⁶ trophozoites and 100 µg of plasmid DNA. Electroporated trophozoites were incubated at 37°C for 48 h. Total proteins were obtained and CAT activity was measured by two-phase diffusion assays [6] using 100 µg of total protein, 200 µl of chloramphenicol (1.25 mM), 10 µl of ¹⁴C-butyryl-CoA (4.15 mCi/mmol, NEN Life Science Products) and 4 ml of scintillation solution. The activities were measured at 2 h intervals in the linear range of the assay. CAT activity was expressed as the percentage of butyrylated derivatives.
The background activities obtained from trophozoites transfected with the promoterless plasmid (pBSCAT-ACT) were subtracted from the activities of trophozoites transfected with the different promoter constructs.
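The background correction described above can be sketched as a small calculation. This is a minimal illustration, assuming that each construct is compared with a reference construct after the promoterless background is subtracted from both; the function name `relative_cat_activity` and all numeric readings are hypothetical and are not data from this study.

```python
def relative_cat_activity(sample, background, control):
    """Background-corrected activity as a percentage of the control construct.

    All three arguments are raw CAT readings in the same units
    (for example, percent butyrylated chloramphenicol).
    """
    corrected_sample = sample - background
    corrected_control = control - background
    if corrected_control <= 0:
        raise ValueError("control activity must exceed background")
    return 100.0 * corrected_sample / corrected_control


# Hypothetical readings, invented for illustration only.
background = 2.0   # promoterless plasmid
control = 52.0     # reference promoter construct
deletion = 27.0    # a deletion construct
print(relative_cat_activity(deletion, background, control))  # 50.0
```

Subtracting the background from both numerator and denominator keeps the promoterless signal from inflating the apparent activity of weak constructs.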

Nuclear extracts preparation
Nuclear extracts (NE) from clone C2 trophozoites were obtained as described previously [4]. Protein concentration was determined by the Bradford method [26].

DNA binding protein purification
E. histolytica nuclear proteins that bind to the R9 region were partially purified using a DNA-binding protein purification kit (Roche). The Pgp1-226/218R3 double-stranded oligonucleotide was used as the DNA probe to obtain concatameric DNA by self-priming PCR. These oligonucleotides undergo a self-priming reaction during the PCR, resulting in long concatamers with hundreds of specific binding sites. Concatameric DNA was bound to magnetic particles, and 50 µg of E. histolytica NE was then added. The specific proteins were captured by the concatameric oligonucleotide with a high affinity constant, while nonspecific proteins (with a lower affinity constant) did not bind. The specific DNA-binding proteins were eluted from the immobilized particles with a high ionic strength buffer. After removal of the elution buffer by filtration with Centricon YM-10 (Millipore), the proteins were transferred to nitrocellulose membranes for Western blot assays. The eluted proteins were also evaluated by supershift assays.

Western blot assays
Partially purified proteins and NE were separated on a 12% polyacrylamide gel, transferred to nitrocellulose membranes, and immunoblotted under standard conditions. Membranes were blocked with 4% nonfat milk in PBS pH 7.4/Tween 0.05% for 2-3 h at room temperature, and then incubated with mouse anti-EhEBP1 antibody (1:600) (kindly supplied by Dr. Carol A. Gilchrist) for 1 h at 37°C. As a control, we used a rabbit polyclonal antibody against human C/EBPβ (Santa Cruz Biotechnology) (1:500). Immunoreactivity was detected by a chromogenic method using peroxidase-labeled anti-mouse and anti-rabbit secondary antibodies, respectively (Zymed Laboratories) (1:3000), and revealed with H₂O₂ and 4-chloro-1-naphthol.

Supershift assays
Supershift assays were performed as for the EMSA described before, with the antibodies added to the reaction mix. Briefly, we used 1 ng of the Pgp1-226/218R3 double-stranded oligonucleotide as radiolabeled probe, 20 µg of NE, mouse anti-EhEBP1 antibody, 1 µg of poly d(I-C) and DNA-protein binding buffer. As a negative control, an antibody against human C/EBPβ was used (Santa Cruz Biotechnology).

Identification of consensus sequences in the region from -234 to -197 bp
The structural analysis of the region from -234 to -197 bp in the EhPgp1 gene core promoter showed three repeated sequences of 7 bp each, located at positions -203 to -197, -218 to -212, and -234 to -228 bp (Figure 1). We named these sequences R7(1), R7(2), and R7(3), respectively. We also identified two repeats of 9 bp each, located at positions -211 to -203 and -226 to -218 bp; these sequences were named R9(1) and R9(2), respectively. In silico analysis of the -234 to -197 bp region was performed to identify potential nuclear factor binding sites. Interestingly, in the R7 sequences we detected consensus sequences for the GAL4, GATA-1, C/EBPβ and NIT-2 transcription factors, as previously reported [23]. We also identified two new consensus sequences, for the GATA-2 and NF-GMb transcription factors. Additionally, we localized three consensus sequences for C/EBPβ and two sequences for the EhEBP1 and EhEBP2 transcription factors that overlap with the R9 motifs (Figure 1).

The distal region of the EhPgp1 promoter, between -234 to -197 bp contains an activator sequence
Guided by the structural and in silico analyses of the promoter, we performed a series of deletions in the -234 to -207 bp region to locate cis-elements that could drive EhPgp1 gene expression. Four different plasmids (p258Pgp1, p246Pgp1, p235Pgp1, and p231Pgp1), carrying the -234 to +24, -222 to +24, -211 to +24 and -207 to +24 bp sequences of the E. histolytica EhPgp1 gene core promoter, respectively, were constructed and transfected into C2 trophozoites. CAT activities were measured and compared with that of the p268Pgp1 plasmid (positive control). The results showed a marked reduction in CAT activity (58%) after truncation of -234 to -222 bp (p246Pgp1 plasmid), suggesting that the R7(3) and R9(2) sequences are required for EhPgp1 gene expression (Figure 2).
As the deletions progressed toward the 3'-end of the promoter (p235Pgp1 and p231Pgp1 plasmids), eliminating almost all the repeated regions, a decrease of 87% in CAT activity was observed with both constructs (Figure 2). These results provide evidence that the R7(3) and R9(2) repeated sequences are involved in EhPgp1 transcriptional activation, but they also reveal the presence of another positive regulatory sequence between positions -218 and -211 bp, which corresponds to the R7(2) sequence (Figure 2).
Comparison of the promoter activities of p258Pgp1 (-234 to +24 bp) and p268Pgp1 (-266 to +24 bp) demonstrated that the 5' deletion up to -234 bp increased promoter activity by 27%, suggesting that the region from -259 to -234 bp could contain negative cis-regulatory elements.
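The names of the four deletion plasmids appear to encode the length of the promoter fragment they carry. The sketch below checks this reading; it assumes promoter numbering without a position 0 (so that -1 is immediately followed by +1), and the helper name `insert_length` is hypothetical. The positive control p268Pgp1 is left out because the coordinates quoted for it in the text do not follow the same pattern.

```python
DOWNSTREAM_END = 24  # every deletion construct ends at +24


def insert_length(upstream_start, downstream_end=DOWNSTREAM_END):
    """Length in bp of a fragment spanning -upstream_start to +downstream_end.

    Assumes there is no position 0, so the fragment contains
    `upstream_start` upstream bases plus `downstream_end` downstream bases.
    """
    return upstream_start + downstream_end


# 5' ends of the four deletion constructs, as given in the text.
deletion_starts = [234, 222, 211, 207]
for start in deletion_starts:
    length = insert_length(start)
    print(f"-{start} to +{DOWNSTREAM_END}: {length} bp (p{length}Pgp1)")
```

Under this assumption the computed lengths are 258, 246, 235 and 231 bp, which match the plasmid names p258Pgp1, p246Pgp1, p235Pgp1 and p231Pgp1.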

A 9 bp repeated sequence is critical for driving EhPgp1 gene expression
To determine whether R7(3), R9(2) or both repeated sequences produced the major effects on promoter activity, we introduced point mutations into the R7(3) and R9(2) core sequences. Mutations of one or two bases at different positions of the repeated sequences did not significantly modify CAT reporter gene activity (data not shown). Thus, we generated constructs containing three or more point mutations. Mutations of the R7(3) sequence (p258R7(3)m4 and p258R7(3)tm plasmids) did not significantly affect EhPgp1 promoter activity (Figure 3). However, three, five or eight mutations in the R9(2) site (p258R9(2)m4, p258R9(2)m5, and p258R9(2)tm plasmids) reduced CAT activity by 70% compared with the wild-type construct (p268Pgp1) (Figure 3). These data strongly suggest that the R9(2) sequence is crucial for EhPgp1 gene transcription.

R9 sequences DNA-protein interactions
Based on our observation that R9(2) could potentially up-regulate the transcriptional activation of the EhPgp1 promoter, we investigated the ability of this repeated sequence to bind nuclear proteins from E. histolytica. Moreover, because R9 is present twice in this region, we also analyzed the relevance of one, two or three repeated sequences for the formation of DNA-protein complexes. To this end, we generated a set of double-stranded oligonucleotides, Pgp1-226/218R3, Pgp1-226/218R2 and Pgp1-226/218R1, containing three, two, and one copy of the R9 repeat, respectively, separated by the same 6 bp that lie between the repeats in the wild-type promoter sequence. The Pgp1-226/218R3 oligonucleotide was used as the probe in electrophoretic mobility shift assays, and the Pgp1-226/218R2, Pgp1-226/218R1 and Pgp1-234/197 (wild-type promoter sequence) oligonucleotides were used as specific competitors.
Incubation of the probe with NE from clone C2 trophozoites resulted in the formation of three DNA-protein complexes of varying intensity, called a, b and c (Figure 4, lane 2). To demonstrate the specificity of binding to the R9 repeat, we added a nonspecific competitor, which failed to compete with the formation of any DNA-protein complex (Figure 4, lane 3). However, all the complexes were specifically competed by the same unlabeled fragment (Figure 4, lane 4). To define whether one or more R9 sequences are necessary for the formation of the DNA-protein complexes, we performed competition assays using the Pgp1-226/218R2 and Pgp1-226/218R1 double-stranded oligonucleotides. Both completely competed the formation of complex a and led to the formation of a new complex with lower electrophoretic mobility (complex d) (Figure 4, lanes 5 and 6). The intensity of complex b was reduced by 52% when the competitor contained two R9 repeats, but by only 4% when it contained one repeat (Figure 4, lanes 5 and 6). Similarly, the intensity of complex c was reduced by 71% and 10% with two and one R9 motifs, respectively (Figure 4, lanes 5 and 6). To determine whether the EhPgp1 wild-type promoter sequence (-234 to -197 bp) forms the same DNA-protein complexes observed with Pgp1-226/218R3, we added the Pgp1-234/197 double-stranded oligonucleotide as a specific competitor. Interestingly, all the complexes were competed. Together, these results demonstrate that the R9 repeat could serve as a recognition sequence for a DNA-binding protein in the parasite extract.
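The percentage reductions quoted for complexes b and c can be read as a comparison of band intensity with and without competitor. The sketch below assumes that the bands were quantified by densitometry, which the text does not state; the function name `percent_reduction` and the intensity values are hypothetical, chosen only so that the result equals the 52% reported for complex b with the two-repeat competitor.

```python
def percent_reduction(uncompeted, competed):
    """Percent loss of band intensity caused by adding a competitor.

    `uncompeted` is the band intensity without competitor and
    `competed` is the intensity of the same band with competitor.
    """
    if uncompeted <= 0:
        raise ValueError("uncompeted intensity must be positive")
    return 100.0 * (uncompeted - competed) / uncompeted


# Hypothetical densitometry readings in arbitrary units.
band_without_competitor = 200.0
band_with_two_repeat_competitor = 96.0
print(percent_reduction(band_without_competitor, band_with_two_repeat_competitor))
# 52.0
```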

EhEBP1 recognizes the R9 repeated sequences
As shown in Figure 1, the R9 sequences contain almost the complete EhEBP1 recognition sequence (URE4). To determine whether this protein was a component of the gel shift complexes formed between R9 DNA and NE from the trophozoites, we partially purified the proteins interacting with R9 by DNA affinity chromatography.
We then performed a Western blot assay using NE from trophozoites, the partially purified fraction and the antibody against the EhEBP1 protein (kindly supplied by Dr. Carol A. Gilchrist). The results revealed a specific band of 28 kDa in NE and in the purified protein fraction (Figure 5A, lanes 1 and 2), indicating that one of the proteins that bind to R9 is an EhEBP1-like protein. In contrast, an unrelated antibody against human C/EBPβ did not detect any band (Figure 5A, lane 3).
To confirm that the R9 repeated sequence is recognized by EhEBP1, we performed a supershift assay using the anti-EhEBP1 antibody. The antibody did not produce a supershifted band but caused fading of the DNA-protein complexes (Figure 5B, lane 3), whereas the addition of a heterologous antibody against the C/EBPβ transcription factor (Figure 5B, lane 4) had no effect, confirming the binding of EhEBP1 to the R9 repeated sequence and the specificity of the immunoreaction with the anti-EhEBP1 antibody.

Discussion
In this study, we have identified a cis-acting element that controls EhPgp1 gene expression in drug-resistant trophozoites (clone C2) of E. histolytica. Since the EhPgp1 gene is implicated in the multidrug resistance of this parasite, the study of its transcriptional control in clone C2 trophozoites provides an excellent in vitro system for studying the molecular basis of EhPgp1 gene regulation. Our previous work established that transcriptional regulation of the EhPgp1 gene promoter depends on two CCAAT/enhancer binding sites (-54 to -43 and -198 to -186 bp) and on other motifs present in the 53 bp upstream of the C/EBPIII site [9]. Moreover, within these 53 bp we delimited a functional region of 38 bp (-234 to -197 bp) that interacts with nuclear proteins from E. histolytica, and in silico analysis showed the presence of GATA-2 and NF-GMb sequences as well as GAL4, NIT2, GATA-1, and C/EBP binding sites [23]. In addition to these consensus sequences, in the present work we located two interesting types of repeated sequences in the EhPgp1 promoter. The first are the R7 repeats, which are present three times (-203 to -197, -218 to -212 and -234 to -228 bp), and the second are the R9 repeats, located at two positions (-211 to -203 and -226 to -218 bp).
Deletions within the EhPgp1 region (-234 to -197 bp) showed that the R7(3) and/or R9(2) repeated sequences could be necessary for EhPgp1 gene expression. One or two point mutations in these sequences did not modify CAT activity, indicating that the DNA-protein interactions were not altered by these changes. Four or more mutations in R7(3) did not affect promoter activity, suggesting that R7(3) is not important for EhPgp1 gene expression; however, we cannot rule out the participation of the R7(2) and R7(1) sequences in the transcriptional regulation of the EhPgp1 gene. On the other hand, we found that the R9(2) repeat was important for promoter activity, because its deletion, or mutation of three bases at the 3' side (AAAAAAATG), of five bases in the middle of the sequence (ATTCTAGTT) or of eight of the nine bases (TTTCTAATG), produced a 70% reduction in CAT activity. These results clearly demonstrate that the R9 repeated sequences are necessary for EhPgp1 gene expression. Interestingly, similar results were observed when the URE4 sequence was identified and characterized in the E. histolytica hgl5 promoter. Four mutations in the middle residues (AATCTAGAA) or at the 3' side (AAAAAAATG) of the upstream repeat, or mutations in both repeated sequences, reduced luciferase activity; the reductions in the last two conditions were 85% and 93%, respectively [22]. Additionally, the upstream repeat was found to be more relevant for promoter activity than the downstream one: mutations in the upstream repeat reduced luciferase reporter activity by 85% relative to wild-type levels, whereas mutations in the downstream repeat reduced it by only 39%. The authors suggested that the downstream repeat may support the binding of factors to the upstream repeat, as evidenced by the fact that separating the repeats by seven base pairs decreased reporter gene activity to 22% [22].
These results are consistent with our findings, because deletion of the R9(2) repeat drastically diminished CAT reporter gene activity (by 58%), while elimination of both R9 repeats produced an additional reduction of 29%. A synergistic and cumulative effect of the R9 repeats on the control of the EhPgp1 gene is therefore possible.
Interestingly, the R9 repeats are notable not only for their functional role, but also for the number of times they occur in the EhPgp1 promoter and for their sequence. Each R9 repeat contains 9 bp (AAAAATGTT) and is separated from the other by 6 bp. These elements present an arrangement similar to that of the URE4 motif identified in the promoter of the E. histolytica Gal/GalNAc lectin heavy subunit hgl5 gene [22]. The URE4 sequence is also composed of two 9 bp repeats (AAAAATGAA), but these are separated by only 3 bp. The two main differences between the URE4 and R9 sequences are the last two bases of the repeats (AA/TT) and the distance between the repeats; nevertheless, the sequences are very similar, suggesting that the R9 repeats form an URE4 element (Figure 5C).
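The comparison between the two repeats can be verified position by position. The short sketch below uses only the two 9 bp sequences quoted above; the function name `mismatch_positions` is hypothetical.

```python
def mismatch_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


URE4_REPEAT = "AAAAATGAA"  # hgl5 promoter repeat, as quoted in the text
R9_REPEAT = "AAAAATGTT"    # EhPgp1 promoter repeat, as quoted in the text

differences = mismatch_positions(URE4_REPEAT, R9_REPEAT)
identical = len(R9_REPEAT) - len(differences)
print(differences)                               # [8, 9]
print(f"{identical}/{len(R9_REPEAT)} identical")  # 7/9 identical
```

The two repeats share their first seven bases and differ only at the last two positions, in agreement with the AA/TT difference noted in the text.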
Moreover, our DNA-protein interaction assays strongly suggest that the R9 motifs are specifically recognized by nuclear proteins from the amoeba and that these proteins require at least two R9 sequences. In the competition assays, the DNA-protein complexes disappeared or their intensity was strongly diminished (by more than 70%) when three or two R9 motifs were used as competitors, whereas one copy of the motif did not modify the formation of the DNA-protein complexes, except for complex a. This indicates that a single R9 is able to interact with amoeba proteins, although possibly with lower affinity and with reduced stability of the DNA-protein interaction. In addition, we observed the formation of a new low-mobility complex, showing that DNA-protein interactions with one, two or three R9 motifs generate complexes of different mobility. Our results suggest that the transcription factors bind independently to each R9 sequence, or simultaneously to both, allowing protein-protein interactions that form a stable dimeric complex. A similar view was proposed by Schaenman et al. [13], who observed a decrease in hgl5 promoter activity when they modified the spacing between the URE4 repeats. Interactions of this kind have also been reported in other systems, such as the proteins that bind to the retinoic acid response element, or the NIT2 nitrogen regulatory protein of N. crassa, which turns on the expression of different structural genes by binding to GATA sequences [27]. The monomeric form of this protein binds to the GATA sequence with very low affinity; however, when two GATA sequences are in close proximity, two monomeric NIT2 proteins can interact with each other to stabilize the interaction [27]. Our results indicate that the presence of two R9 motifs favors their recognition by nuclear proteins from E. histolytica and may activate EhPgp1 gene expression.
Finally, because the R9 repeats may be an URE4 element (Figure 5C), it is possible that the EhEBP1 protein binds to the R9 motifs of the EhPgp1 promoter to induce its expression. This assumption is supported by two findings: i) the Western blot assays performed with the partially purified R9-binding proteins and the anti-EhEBP1 antibody revealed a 28 kDa protein, and ii) in the supershift assays, the DNA-protein complexes were competed by the anti-EhEBP1 antibody. Consistent with these results, previous reports [13,22] showed that antibodies against EhEBP1 recognized two proteins of 28 and 18 kDa in E. histolytica extracts and were able to compete the URE4-protein interaction of the hgl5 promoter. Interestingly, overexpression of the 28 kDa EhEBP1 represses hgl5 gene transcription through its ability to recognize the URE4 motif. In our case, EhEBP1 could be participating in the transcriptional activation of the EhPgp1 gene through its interaction with the R9 repeats. Several transcription factors, such as Sp3, GATA and C/EBP among others, have been reported to act as either positive or negative regulators of transcription depending on the promoter and cell type [28-30]. EhEBP1 seems to play a similar dual role in the amoeba; however, more in-depth investigation will be required to learn more about this protein and its functional role in the transcriptional control of EhPgp1 and other genes.
Taking these results together, and in agreement with our previous reports [9,23], we propose a model for the regulation of the EhPgp1 gene. In this model, transcriptional control of the EhPgp1 gene is coordinated by the C/EBPI, C/EBPIII and R9 motifs, which are recognized by C/EBP-like and EhEBP1 transcription factors, respectively. These complexes may then interact with the basal transcription machinery and activate the EhPgp1 gene (Figure 6). However, the order in which each transcription factor binds, and how the factors interact to enhance EhPgp1 transcriptional activation, remain open questions.