LINE-1 Based Insertional Mutagenesis Screens to Identify Genes Involved in Embryonic Stem Cell Differentiation

Background: Knowledge of the intrinsic properties of embryonic stem cells is essential before the possibility of using these cells for therapeutic purposes becomes a reality. Insertional mutagenesis screens are widely used to identify genes sufficient to confer a particular phenotype; however, applications of such approaches are limited to mouse models. Thus, there is a need to develop a new DNA transposon system that allows gene discovery approaches in stem cells.


Introduction
Embryonic stem (ES) cells offer great hope for the future of medicine in such areas as cardiovascular disease, diabetes and Parkinson's disease. Of course, knowledge of the intrinsic properties of ES cells must be gained before the possibility of using these cells for therapeutic purposes becomes a reality. ES cells possess an unlimited capacity to be maintained as self-renewing undifferentiated cells. Their ability to grow for prolonged periods of time while maintaining a normal karyotype and pluripotency offers the enormous potential of differentiating them into any specific cell type of interest by manipulating their growth conditions. In vitro, ES cells are maintained in the presence of the cytokine leukemia inhibitory factor (LIF) as undifferentiated populations [1] that are capable of differentiating into various cell types. Upon removal of LIF, the ES cells can be easily induced to differentiate into spheroid cell aggregates termed embryoid bodies, recapitulating early developmental processes. These ES cell capabilities require the selective activation and repression of many genes or networks of regulatory genes [2,3]. Our current understanding of the genes that regulate the self-renewal and differentiation of stem cells is far from complete, and identifying these genes is critical for understanding the molecular basis of cell commitment. In addition, several genes known to play a role in human leukemogenesis have been identified in mouse ES cells, including Notch1, Flt3, Lmo2 and Nf1 [4]. Thus, identifying the genes that govern ES cells would provide new insights into proteins whose function was previously unknown.
Loss-of-function genetic screening is a powerful method for identifying the genes that are sufficient to confer a particular phenotype. A traditional approach for generating loss-of-function gene mutations has been the use of N-ethyl-N-nitrosourea (ENU) as a mutagen, but finding the causal point mutation in such systems is difficult and time consuming. Recently, RNAi has been shown to be effective in downregulating gene function; however, this strategy is labor intensive and is not a cost-effective approach. Alternatively, insertional mutagenesis by a retroviral vector derived from murine leukemia virus has proved to be a powerful gene trap method for generating loss-of-function mutations in ES cells but requires the generation of adult mice for functional analysis [5]. Several groups have recently used DNA transposons, such as Sleeping Beauty (SB) and Drosophila PiggyBac, for gene trap methods [6,7]. However, one limitation of the currently available SB transposons is the necessity of co-delivering the SB transposon with transposase-encoding DNA for the gene integration event. In addition, low transfer efficiency and a lack of sustained transposase expression are problems that have been reported to occur during cell culture [8,9]. Although the SB transposon is commonly used, other DNA transposons are currently being developed for gene discovery approaches. In this study, we explored the possibility of using long interspersed nuclear element 1 (LINE-1 or L1 retrotransposon) as a tool to deliver a gene trap in ES cells because of its efficient disruption of gene function.
The L1 retrotransposon is an insertional mutagen that is capable of inserting its sequence into a gene and disrupting the gene's function in individual cells or whole animals. L1 offers potential advantages over the SB and PiggyBac transposons, as L1 performs the random disruption of genes at a high frequency of insertion throughout the genome [10,11]. In addition, L1 mobilizes itself to a new genomic location by a 'copy and paste' mechanism, which offers an infinite source of insertional mutagens for efficient gene knockout throughout the genome. Furthermore, the L1 insertion is stable and permanent in all of the progeny of integrated cells, and the inserted sequence itself serves as a molecular tag to identify the disrupted gene [12]. These inherent features of L1 make it a valuable tool in ES cell gene discovery applications. We have created an episomal, nonviral L1 retrotransposition system using the scaffold/matrix attachment regions (S/MARs) in the vector backbone [13] and evaluated its utility in identifying genes in mouse ES cells.
Here, we demonstrate that, by using this vector coupled with GFP expression, we isolated 4 individual ES cell clones, out of 50 clones screened, that carry disrupted genes, including one novel gene. We confirmed the identity of all of these genes by an inverse PCR method and verified their function in cell differentiation using markers of undifferentiated ES cells. The ease of using this insertional mutagenesis approach and the simplicity of identifying cells with disrupted genes by GFP expression make this L1 system a potential tool for ES cell gene discovery. As with other systems, this vector can also be applied to different kinds of stem cells or cancer stem cells to identify the genes that are responsive to certain cell growth conditions or involved in cancer development.

L1 expression vector
The construction of the S/MAR-based L1 retrotransposition vector has been described previously [13]. The vector DNA was amplified in E. coli DH10B cells and isolated using an endotoxin-free Maxi prep kit (Qiagen).

ES cell culture and transfection
Mouse ES cells were grown on mitomycin C-treated embryonic fibroblast (MEF) cell layers and were cultured in complete knockout Dulbecco's modified Eagle's medium (Invitrogen) with 4.5 mg/ml glucose, 2 mM L-glutamine, 0.1 mM β-mercaptoethanol, 15% FCS and 1000 U/ml leukemia inhibitory factor (LIF). The same growth medium (without LIF and MEFs) was used to differentiate the ES cells. Fresh glutamine was added when the medium was replaced daily. ES cell transfection was performed using the Amaxa ES nucleofector kit. Briefly, 2 × 10^5 ES cells were suspended in 90 µl Nucleofection solution, and 10 µl Nucleofection solution containing 5 µg L1 vector was added. This mixture was transferred to an electroporation cuvette and electroporated using the A-23 program of the Nucleofector 1 device (Amaxa Biosystems). Approximately 500 µl of the pre-warmed medium was added to the cuvette, and the ES cells were transferred to a 10-cm culture plate containing an MEF feeder layer. Neomycin (G418) was added to the medium 24 h after transfection at a final concentration of 175 µg/ml. The ES cells were subjected to neomycin selection for 7 to 8 days until colonies approximately 1 mm in diameter appeared.

GFP expression of ES clones
Using fluorescence microscopy, G418-resistant GFP-positive colonies were individually picked, partially digested with trypsin and transferred to a 24-well plate seeded with MEF feeder layers. After growing for 3 days, half of the ES cells derived from each clone were frozen as a stock, and the remainder was split into two wells of 24-well plates. One portion of the cells was grown in the medium with LIF and MEFs, and the other was grown without LIF and MEFs for 5 to 6 days until the formation of embryoid bodies. The ES clones that failed to differentiate in the absence of LIF and MEFs were harvested for further analysis.

PCR analysis of ES cells with retrotransposon insertions
Genomic DNA was isolated from each candidate ES clone using a QIAamp DNA kit (Qiagen), and PCR was performed using the Geno-5' (5'-TATTGCCGATCCCCTCAGAAGA-3') and Geno-3' (5'-CAAGGACGACGGCAACTACAAG-3') primers to examine whether GFP expression was a result of retrotransposed L1 insertion. The amplification was performed in 50-µl reactions containing 5 µl of 10X PCR buffer with 2.5 mM MgCl2, 1 µl of 10 mM dNTPs, 5 µM each primer and 250 ng genomic DNA. After an initial step at 95°C for 10 min, 30 cycles of amplification were performed (95°C for 30 s, 58°C for 15 s and 72°C for 2 min), followed by a final step at 72°C for 8 min. The amplified products were visualized on a 1.2% agarose gel. Genomic DNA from the wild-type ES cells and vector DNA were used as negative and positive controls, respectively.
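The reaction set-up and cycling program above can be checked with simple arithmetic. The following is a minimal sketch, not part of the original protocol: the function names are hypothetical, and the program time counts only the programmed hold times, ignoring thermocycler ramping.

```python
def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """Dilution rule C1*V1 = C2*V2, solved for C2 (same unit as stock_conc)."""
    return stock_conc * stock_volume_ul / total_volume_ul


def program_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times; ramping between steps is ignored."""
    return initial_s + cycles * sum(step_seconds) + final_s


TOTAL_UL = 50.0  # reaction volume stated in the text

buffer_x = final_concentration(10.0, 5.0, TOTAL_UL)  # 5 ul of 10X buffer
dntp_mm = final_concentration(10.0, 1.0, TOTAL_UL)   # 1 ul of 10 mM dNTPs

# 95 C for 10 min, 30 x (30 s + 15 s + 2 min), 72 C for 8 min
hold_s = program_seconds(10 * 60, 30, (30, 15, 120), 8 * 60)

print(f"buffer: {buffer_x:g}X, dNTPs: {dntp_mm:g} mM, hold time: {hold_s / 60:g} min")
```

Under these assumptions, the buffer is at 1X and the dNTPs at 0.2 mM in the final reaction, and the program holds for about 100 min in total.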

Identification of the integration site by Inverse PCR
Approximately 1 µg genomic DNA was subjected to SspI restriction enzyme digestion at 37°C for 4 h. After heat inactivation, the digested DNA was treated with 20 units of T4 DNA ligase (NEB) for overnight self-ligation. Approximately 250 ng of the ligation mixture was used directly as the DNA template for the inverse PCR reaction. The first round of the inverse PCR was performed using the following primers directed against the GFP cassette: 5'-CTTGAAGAAGATGGTGCG-3' and 5'-ACAACCACTACCTGAGCACC-3'. We used a 5-µl aliquot from the first round of PCR as the template for the second round of nested PCR using the primers 5'-TTGAAGAAGTCGTGCTGC-3' and 5'-AAAGACCCCAACGAGAAGCG-3'. The product was cloned directly into the TOPO-XL (Invitrogen) vector and sequenced. The locations of the L1 integration sites and the disrupted genes were identified using the UCSC mouse genome browser.
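The logic of inverse PCR (digest, self-ligate, then amplify outward from the known insert across the ligation junction) can be illustrated in silico. The sketch below is a toy model under stated assumptions: all sequences, `INSERT`, `FWD` and `REV_SITE` are hypothetical and are not the primers or the L1 cassette used in this study; only the SspI recognition site (AATATT, blunt cut between AAT and ATT) is real. A single strand is modeled, and `REV_SITE` is the top-strand sequence to which the reverse primer anneals.

```python
SSPI_SITE = "AATATT"  # SspI cuts blunt between AAT and ATT


def digest(seq, site=SSPI_SITE, cut_offset=3):
    """Cut a linear sequence at every occurrence of the site; return fragments."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


def inverse_pcr_product(fragment, fwd, rev_site):
    """Amplify outward from the insert on the self-ligated (circular) fragment."""
    f = fragment.find(fwd)
    r = fragment.find(rev_site)
    if f == -1 or r == -1:
        return None
    end = r + len(rev_site)
    if end <= f:                 # product wraps around the ligation junction
        end += len(fragment)
    return (fragment + fragment)[f:end]


# Hypothetical toy locus: unknown flanks around a known insert.
INSERT = "ACGTGGATCCGAGCTCTTGA"
FWD = "CTCTTGA"      # matches the 3' end of the insert, points into the right flank
REV_SITE = "ACGTGG"  # 5' end of the insert, where the reverse primer anneals
genome = "TTTT" + SSPI_SITE + "CAGGCA" + INSERT + "TGACCT" + SSPI_SITE + "GGGG"

fragments = digest(genome)
target = next(fr for fr in fragments if FWD in fr and REV_SITE in fr)
product = inverse_pcr_product(target, FWD, REV_SITE)
print(product)
```

In this toy case the product carries both flanks (TGACCT and CAGGCA) joined at a regenerated SspI site, which is what allows the integration site to be read from the sequenced amplicon.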

shRNA transfection and validation of mutations
Tollip-specific shRNAs (TGGACTCGTTCTACCTTGA, CTCTGCCAAGATACTAGAA and ATGGTGGTAACAAGCACTT) in the pSiLv-mU6 vector were purchased from GeneCopoeia (product ID: MSH033160). Each shRNA was transfected independently into wild-type mouse ES cells using an Amaxa Nucleofector kit as described above. The depletion of Tollip mRNAs was confirmed by qRT-PCR. The expression of the mCherryFP gene, which is present in the vector, was used as a marker to identify the transfected ES colonies, which were transferred to 24-well plates and grown in medium with or without LIF and MEFs for 5 days. Differentiated and undifferentiated ES cells were subsequently subjected to immunofluorescence staining with antibodies against the pluripotency markers Nanog, Oct4 and SSEA-1, according to the manufacturer's instructions (Abcam).

Construction of an episome vector expressing the L1 retrotransposon that allows GFP expression and G418 drug selection
As an insertional mutagenesis system, we constructed an S/MAR-based L1 retrotransposon vector that consisted of a full-length active human L1 gene (L1RP) under the control of the cytomegalovirus (CMV) immediate early promoter, which has been reported to be transcriptionally active in a variety of undifferentiated mouse ES cells [14]. The 3'-UTR of the L1 gene harbors a visual GFP marker disrupted by 960 bp of a γ-globin intron in an antisense orientation (Figure 1A). GFP is co-transcribed as a single fusion transcript due to the presence of the splicing sites in the intron sequences. This arrangement ensures that GFP expression occurs only after L1 mobilization or insertional mutagenesis, i.e., after L1 expression, γ-globin intron splicing, reverse transcription, and integration of the L1 copy into the genomic DNA (Figure 1B). Therefore, no GFP reporter is expressed unless the newly synthesized L1 is integrated into a new genomic location, and observation of the expression of GFP under a UV light would allow us to detect a real-time L1 disruptional event in living ES cells without cell staining (Figure 1C). In addition, this vector also contains a neomycin resistance gene that can be used to select transfected ES clones. To ensure the stability and the integrity of this vector, a DNA fragment containing scaffold/matrix attachment regions (S/MARs) was added to the vector backbone. We have recently shown that the inclusion of S/MARs effectively maintains a single copy of the vector in cells throughout multiple rounds of cell division without undergoing epigenetic silencing by DNA methylation or a loss of the vector, even in the absence of G418 drug selection [13].
We introduced the S/MAR-L1-GFP vector into mouse ES cells using nucleofection and, as determined by a fluorescence-activated cell sorting (FACS) analysis, achieved 2.4 ± 0.26% GFP-positive cells, compared to 1.7 ± 0.12% GFP-positive cells using an EBNA-based RP99-L1-GFP vector carrying the same L1-GFP expression cassette. This high level of retrotransposition from S/MAR-L1-GFP in ES cells is consistent with our previous studies [13], which demonstrated that the S/MAR-based L1 vector system increases the L1 retrotransposition events during the culturing of human somatic cells.

Selection of disrupted genes in ES clones that fail to differentiate
Undifferentiated ES cells contain reduced levels of global DNA methylation and active or open chromatin structures [15]. Under these conditions, the expression of the L1 retrotransposon results in the random insertion of a copy into the host genome. When L1 inserts into a gene, the protein encoded by that gene should be truncated and its function disrupted [16]. If the disrupted gene is essential for self-renewal and differentiation, then the ES cells may remain in an undifferentiated state even when subjected to differentiation-inducing conditions, such as the absence of LIF. Interestingly, L1 is rarely expressed in terminally differentiated cells. This approach should allow us to discover the potential genes that regulate ES cell renewal and differentiation.
To test this approach, we transfected mouse ES cells derived from C57BL/6 mice; the approach used to isolate the candidate gene in the ES cells is shown in Figure 2. An S/MAR-based L1 retrotransposition vector (pS/MAR-L1-GFP) was transfected into ES cells using the Amaxa ES Nucleofector Kit, and the cells were grown on a layer of feeder fibroblast cells (MEFs). The G418-resistant colonies were selected with 175 µg/ml neomycin for 7 days. Using fluorescence microscopy, each G418-resistant GFP-positive colony was picked individually and transferred to a 24-well plate seeded with MEF feeder layers. It should be emphasized that the GFP expression from the L1 vector occurs only after the successful insertion of the L1 copy into a new genomic location. In this study, we screened approximately 50 ES cell clones that displayed resistance to G418 in a single 10-cm plate. Approximately 85% of the G418-resistant clones we examined did not show any GFP expression [...] vector approaches in which at least 10 insertional colonies have been reported in a single 10-cm dish [17]. In total, we identified only 9 GFP-positive, G418-resistant clones in a 10-cm plate. Each GFP-positive colony was maintained as an independent clone on MEFs in replicates; one set was cryopreserved, and the others were used to determine whether the cells retained the ability to differentiate. ES cells undergo in vitro cell differentiation in the absence of LIF and MEFs.
The selected GFP clones were grown in the absence of LIF and MEF feeder cells for 5 to 6 days, either in adherent monoculture or in suspension culture, for the formation of embryoid bodies (EBs). If a gene putatively involved in differentiation is disrupted by the L1 insertion, these cells will lose the ability to differentiate into EBs. For comparison, untransformed wild-type ES cells or pEGFP-N1-expressing ES cells were used as the controls. The failure of the ES clones to differentiate was assessed by a morphological analysis using an inverted phase contrast microscope, thus narrowing down the potential ES clones to a manageable number for further genetic and phenotypic analysis. In the absence of LIF and MEFs, we found that 4 out of 9 individual GFP-positive clones remained undifferentiated. In contrast, the other five GFP-positive clones became differentiated within 4 days, similar to the wild-type ES cells without LIF and MEFs; we reasoned that these cases might be due to the insertion of L1 into genes or noncoding intergenic regions that are not essential for ES cell differentiation. In fact, analyzing the L1 integration site of one of the GFP-positive ES clones that differentiated showed that L1 was located in an intergenic region of chromosome 11 (112835881-112836059), a site with no genes nearby. A BLAST analysis of this genomic sequence showed that this region contained only a hypothetical protein, LOC72386, which is located 70 kb downstream of the L1 integration site (data not shown).
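The yield of the screen at each stage follows directly from the counts reported above. The short sketch below only restates those counts (approximately 50 G418-resistant clones, 9 GFP-positive clones, 4 clones that failed to differentiate); the variable names are hypothetical, and because the clone total is approximate the percentages are indicative only.

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`."""
    return 100.0 * part / whole


g418_resistant = 50    # approximate number of clones screened
gfp_positive = 9       # clones with a retrotransposed, GFP-expressing L1 copy
undifferentiated = 4   # GFP-positive clones that failed to differentiate

gfp_rate = percent(gfp_positive, g418_resistant)
hit_rate_among_gfp = percent(undifferentiated, gfp_positive)
hit_rate_overall = percent(undifferentiated, g418_resistant)

print(f"GFP-positive: {gfp_rate:.0f}% of G418-resistant clones")
print(f"Failed to differentiate: {hit_rate_among_gfp:.0f}% of GFP-positive clones, "
      f"{hit_rate_overall:.0f}% of all clones screened")
```

With these counts, 18% of the resistant clones were GFP-positive, in line with the statement that most G418-resistant clones showed no GFP, and 8% of all screened clones were candidate differentiation mutants.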
The intensity of the GFP fluorescence and the degree of cell differentiation under differentiation-inducing conditions varied among the candidate clones. One of the ES clones (named A8) that showed weak GFP expression was found to exhibit a delayed partial cell differentiation, whereas the other clones (named A2, B1 and B9) completely failed to undergo any cell differentiation even after 4 weeks of culturing in the absence of LIF (Figure 3A). To determine whether the defective ES cell clones were a result of the L1 insertion, we conducted a PCR-based analysis to confirm the presence of spliced GFP in these cells. The genomic DNA isolated from the defective ES clones was subjected to a GFP-specific PCR assay, as shown in Figure 1B. Oligonucleotide primers specific for the GFP gene were used to determine whether the γ-globin intron had been removed by splicing during the L1 insertional events; the unspliced PCR product is 1491 bp, whereas a 531-bp product is expected after a successful L1 insertion. This simple strategy allowed us to confirm the presence of retrotransposed L1 insertions in the defective ES cell colonies.
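The two diagnostic product sizes are related by the length of the γ-globin intron (1491 bp - 960 bp = 531 bp). As a minimal sketch, a band estimated from a gel could be assigned as follows; the function name and the 50-bp tolerance are assumptions made for illustration, not values from this study.

```python
UNSPLICED_BP = 1491                     # intron still present (vector-derived)
INTRON_BP = 960                         # gamma-globin intron interrupting GFP
SPLICED_BP = UNSPLICED_BP - INTRON_BP   # expected after retrotransposition


def classify_band(size_bp, tolerance_bp=50):
    """Assign an estimated band size to the spliced or unspliced GFP product."""
    if abs(size_bp - SPLICED_BP) <= tolerance_bp:
        return "spliced"      # consistent with a retrotransposed L1 copy
    if abs(size_bp - UNSPLICED_BP) <= tolerance_bp:
        return "unspliced"    # consistent with the episomal vector
    return "unexpected"


print(SPLICED_BP, classify_band(540), classify_band(1500), classify_band(900))
```

A clone may show both bands, since the episomal vector (unspliced) and an integrated copy (spliced) can be present in the same cells.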

Identification of the disrupted candidate genes in the ES clones
To identify the disrupted genes in the ES clones, we used an inverse PCR analysis. Approximately 1 µg genomic DNA was isolated from the candidate ES cell clones and was digested with SspI followed by self-ligation with T4 DNA ligase. The resulting products were subjected to inverse PCR amplification using a primer complementary to the 3'-UTR of the L1 sequence (forward primer) and another primer corresponding to the downstream region of the GFP sequence (reverse primer). As the sequence of human L1 is completely different from the mouse version, the inverse primers should specifically amplify only the integrated L1 insert. The PCR product was subsequently cloned into the TOPO-XL vector for DNA sequencing, which enabled us to identify the flanking sequences of the disrupted genes in the genomic DNA using the UCSC mouse genome browser (Figure 3B). Using this strategy, we identified the L1 integration sites and their corresponding disrupted genes for the candidate ES clones that had lost the ability to differentiate in the absence of LIF and MEFs, as listed in Table 1. Although L1 normally results in a single insertion event, we found that it occasionally integrates into multiple sites of the genome.
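Once a flanking sequence has been mapped to a genomic coordinate, deciding whether the insertion is exonic, intronic or intergenic is an interval lookup against gene annotations. The sketch below illustrates that step only; the `GeneModel` class, the gene name `GeneX`, the chromosome label and all coordinates are hypothetical, and exon numbering assumes a plus-strand gene with exons listed in genomic order.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GeneModel:
    name: str
    chrom: str
    start: int                     # first base of the gene (inclusive)
    end: int                       # last base of the gene (inclusive)
    exons: List[Tuple[int, int]]   # inclusive exon intervals, genomic order


def classify_insertion(chrom: str, pos: int, genes: List[GeneModel]) -> str:
    """Report whether a mapped insertion falls in an exon, an intron or no gene."""
    for gene in genes:
        if gene.chrom == chrom and gene.start <= pos <= gene.end:
            for number, (exon_start, exon_end) in enumerate(gene.exons, start=1):
                if exon_start <= pos <= exon_end:
                    return f"{gene.name}: exon {number}"
            return f"{gene.name}: intron"
    return "intergenic"


# Hypothetical annotation used only to exercise the function.
annotation = [GeneModel("GeneX", "chrN", 1000, 5000,
                        [(1000, 1200), (2000, 2300), (4500, 5000)])]

print(classify_insertion("chrN", 2100, annotation))   # inside the second exon
print(classify_insertion("chrN", 3000, annotation))   # between exons
print(classify_insertion("chrN", 9000, annotation))   # outside any gene
```

In practice the annotation would come from a genome browser track rather than a hand-written list.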
Two of the identified genes resulting from the L1 insertions are Arp6 actin-related protein (Actr6) (A2) and GATA-binding protein 4 (Gata4) (B9), which were previously shown to be necessary for ES cell differentiation. The Arp6 actin-related protein is a subunit of the chromatin-remodeling factor Tip60. Using high-throughput RNAi screening, Tip60 was recently shown to be essential for ES cell development through its functional overlap with Nanog [18]. Depletion of Tip60 was also found to reduce the levels of the cell-cycle regulators and metabolic genes required for ES cell division. Gata4 is one of the well-characterized transcription factors that are expressed in ES cells [19]. The L1 integration in the third ES clone (A8), which exhibited delayed partial cell differentiation, mapped to three different sites; one insertion was identified in an intron of the tripartite motif-containing 59 protein (TRIM59) gene, and the remaining two sites occurred in intergenic regions of chromosomes 7 and 11. Because this ES clone (A8) contains more than one L1 integration site, we did not pursue it further; the complexity and heterogeneity of multiple disruptions make them difficult to analyze. Interestingly, the L1 insertion site in the fourth clone (B1) is in a candidate gene that has never been reported to play a role in ES cell differentiation. The L1 insertion is located in exon 2 of the Toll-interacting protein (Tollip) gene. Although little is known about this gene in ES cell differentiation, Tollip is known to interact with Toll-like receptors (TLR) to inhibit TLR-mediated signaling pathways [20,21]. In addition, Tollip plays a key role in cell-cell signaling, inflammatory cytokine production, and intracellular signaling pathways. Thus, further studies are required to delineate the exact function of this gene in stem cell differentiation.
Nevertheless, the data presented here suggest that the L1 retrotransposon system can identify previously unknown genes involved in mouse ES cell differentiation.

Confirming the function of the disrupted genes in ES cell differentiation
To confirm whether ES cell differentiation is indeed dependent on the function of the identified genes, we used shRNAs against each gene in wild-type ES cells, followed by staining for alkaline phosphatase (AP), which is known to be expressed at a high level in the cell membrane of undifferentiated ES cells [22]. As the expression of AP is often used as an indicator of the undifferentiated state of ES cells, we utilized this simple staining approach to validate the undifferentiated state of the ES cell clones. Using wild-type ES cells, we introduced three Tollip-specific prevalidated shRNAs independently into the cells and, as determined by qRT-PCR analysis, achieved 78 ± 6% knockdown efficiency of the Tollip gene compared to the wild-type ES cells (data not shown). The ES cells transfected with each shRNA against the Tollip gene were cultured in the presence and absence of LIF, followed by AP staining. In this assay, the Actr6 mutant ES cell (clone A2) was used as a known positive control. The Tollip-knockdown ES cells showed high levels of AP activity in the presence or absence of LIF and MEFs, similar to the expression pattern observed with other undifferentiated ES cell markers [23,24], including Oct4, Nanog, and SSEA-1 (Figure 3C). Additionally, we employed a semi-quantitative RT-PCR analysis to evaluate whether the ES cell markers showed similar expression levels in response to the removal of LIF. The expression levels of the Nanog and Oct4 markers remained similar in both the presence and absence of LIF (Figure 3D). As expected, under the standard in vitro-directed cell differentiation conditions, the Tollip-knockdown ES cells failed to undergo cell differentiation in the absence of LIF. Taken together, these results suggest that Tollip plays an essential role in ES cell differentiation.
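The text reports a knockdown efficiency from qRT-PCR but does not state how it was quantified. As an illustration only, the sketch below assumes the commonly used comparative Ct (2^-ΔΔCt) calculation with a reference gene; the Ct values are hypothetical and are not data from this study.

```python
import math


def relative_expression(ct_target_kd, ct_ref_kd, ct_target_wt, ct_ref_wt):
    """Comparative Ct estimate: 2 ** -(dCt_knockdown - dCt_wildtype)."""
    delta_delta_ct = (ct_target_kd - ct_ref_kd) - (ct_target_wt - ct_ref_wt)
    return 2.0 ** (-delta_delta_ct)


def knockdown_percent(rel_expr):
    """Percentage reduction in target mRNA relative to wild-type cells."""
    return 100.0 * (1.0 - rel_expr)


# Hypothetical Ct values: the target amplifies 2 cycles later after knockdown.
rel = relative_expression(ct_target_kd=26.0, ct_ref_kd=18.0,
                          ct_target_wt=24.0, ct_ref_wt=18.0)
kd = knockdown_percent(rel)

# ddCt that would correspond to the reported mean knockdown of 78%.
ddct_for_78 = -math.log2(1.0 - 0.78)

print(f"relative expression {rel:.2f}, knockdown {kd:.0f}%, "
      f"ddCt for 78% knockdown {ddct_for_78:.2f}")
```

Under this assumption, a 78% knockdown corresponds to a shift of roughly 2.2 cycles in the normalized Ct of the target transcript.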
At present, little is known about the function of Tollip or its regulatory pathways that direct stem cell differentiation. Tollip is known to act in the immune response to invading pathogens by controlling IRAK phosphorylation in the TLR and IL-1R signaling pathways [20]. Tollip is also known to suppress TLR-mediated NF-κB activity. A recent study in mesenchymal stem cells suggests that TLR4, which is the receptor for Tollip, inhibits the activation of transcription factor STAT3 (signal transducer and activator of transcription 3) and thereby exerts deleterious effects on stem cell proliferation [25]. As STAT3 mediates self-renewal and the maintenance of pluripotency in the absence of LIF, it is essential for ES cell differentiation [26]. Several molecules, such as Zfp57, GABP and β-catenin, have been identified as the factors involved in the LIF/STAT3 pathway. Thus, it is tempting to speculate that the disruption of Tollip may inactivate STAT3 through TLR receptors, resulting in the lack of proper ES cell differentiation. However, further studies are required to delineate the exact molecular mechanisms of this gene. Nonetheless, the data presented in this study show that the L1 retrotransposon approach can identify both novel genes and known genes that regulate ES cell differentiation.

Conclusion
Embryonic stem cells have a relatively stable genome and are highly amenable to loss-of-function genetic screens. The recent development of the L1 retrotransposon approach offers an efficient and highly versatile alternative for achieving the complete disruption of gene function. The ease of using this insertional mutagen and the simplicity of identifying the cells with disrupted genes by GFP expression make this L1 vector a promising tool to identify the genes that play roles in cell growth, morphology, differentiation and proliferation. Given that ES cells and other stem cells share many similarities, including the ability to self-renew, pluripotency and virtually identical chromatin states, this system can also be applied to identify critical genes in other stem cells, such as muscle stem cells, pluripotential stem cells or cancer stem cells.