Received date: August 13, 2013; Accepted date: September 18, 2013; Published date: September 20, 2013
Citation: Powers JM, Trobridge GD (2013) Identification of Hematopoietic Stem Cell Engraftment Genes in Gene Therapy Studies. J Stem Cell Res Ther S3:004. doi:10.4172/2157-7633.S3-004
Copyright: © 2013 Powers JM, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Hematopoietic stem cell (HSC) therapy using replication-incompetent retroviral vectors is a promising approach to provide life-long correction for genetic defects. HSC gene therapy clinical studies have resulted in functional cures for several diseases, but in some studies clonal expansion or leukemia has occurred. These adverse events are caused by dysregulation of endogenous host gene expression through insertional mutagenesis by the vector provirus. Insertional mutagenesis screens using replicating retroviruses have been used extensively to identify genes that influence oncogenesis. However, retroviral mutagenesis screens can also be used to determine the role of genes in biological processes such as stem cell engraftment. The aim of this review is to describe the potential for vector insertion site data from gene therapy studies to provide novel insights into mechanisms of HSC engraftment. In HSC gene therapy studies, dysregulation of host genes by replication-incompetent vector proviruses may lead to enrichment of repopulating clones with vector integrants near genes that influence engraftment. Thus, data from HSC gene therapy studies can be used to identify novel candidate engraftment genes. As HSC gene therapy use continues to expand, the vector insertion site data collected will be of great interest to help identify novel engraftment genes and may ultimately lead to new therapies to improve engraftment.
Engraftment; Gene therapy; Hematopoietic stem cell; Viral vector; Insertional mutagenesis
Gene therapy using hematopoietic stem cells (HSCs) has enormous potential to treat diseases of the hematopoietic system, including immune diseases. In this approach, HSCs are collected from a patient, gene-modified ex vivo using integrating retroviral vectors, and then infused back into the patient. To date, retroviral vectors have been the only effective gene delivery system for HSC gene therapy. This is primarily due to the ability of retroviral vectors to integrate efficiently into the genome, so that the therapeutic transgene is passed to all HSC-derived cells at mitosis. Gene delivery to HSCs using integrating vectors thus allows for efficient delivery to HSC-derived mature hematopoietic cells.
Retroviral vectors have been used successfully in HSC gene therapy clinical trials for several genetic diseases including X-linked severe combined immunodeficiency (SCID-X1) [1,2], adenosine deaminase deficiency (SCID-ADA) [3,4], chronic granulomatous disease (CGD), and adrenoleukodystrophy (ALD). HSC gene therapy also has the potential to treat acquired diseases of the hematopoietic system such as human immunodeficiency virus infection and acquired immunodeficiency syndrome (HIV/AIDS). While recent clinical studies have shown promise, the use of retroviral vectors for gene therapy has drawbacks. Gene therapy using HSCs with integrating retroviral vectors can dysregulate cellular genes near the provirus integration site, leading to adverse side effects including leukemia [8-10].
Previous human clinical studies have documented the impact of vector-mediated dysregulation of host genes. In both the French and United Kingdom SCID-X1 studies vector-mediated gene dysregulation resulted in the development of leukemia [8-10]. In a CGD study conducted by Ott and colleagues, proviral insertion sites led to the clonal expansion of gene-modified cells over time [5,11]. In this CGD study the vector provirus provided the gene-modified HSCs with a survival advantage, leading to the clonal dominance of a small subset of gene-modified cells in the patient. In the above SCID-X1 and CGD studies, the ability to determine where the provirus had inserted into the genome allowed for the identification of nearby genes that were dysregulated, leading to clonal expansion. The integrated provirus can thus be used as a molecular tag to identify dysregulated genes in gene therapy studies.
Gene-modified HSCs that are infused into patients undergo various selective pressures during the process of stem cell engraftment. First, the cells must home to the stem cell niche and resist apoptosis during this process. Once in the bone marrow, HSCs begin the production of all hematopoietic cell lineages, which involves survival, stem cell self-renewal, proliferation and differentiation. Together, these processes are referred to as engraftment, and many genes could potentially provide a selective advantage to repopulating cells if dysregulated. The gene-modified cells that are infused into a patient are a polyclonal population, where different cells have vector proviruses integrated at different chromosomal locations. Millions of clones may be infused into a patient, and this polyclonal population of cells is, in essence, a library of clones with many different unique integration sites. If a clone has a vector integrant near a gene that influences the efficiency of engraftment, that clone may have a selective advantage and may be over-represented when engrafted cells are analyzed (Figure 1). Thus, pre-clinical and clinical HSC gene therapy studies provide an opportunity to identify genes near vector proviruses in over-represented clones. These genes may have conferred an increased survival and proliferation advantage to the infused cells due to dysregulation mediated by the integrated provirus.
Figure 1: Selective pressure for HSCs to engraft enriches for clones with proviral integration sites that confer an engraftment advantage. After harvesting patient HSCs, the cells are transduced with retroviral vectors, leading to a polyclonal population of cells with numerous different proviral insertion sites. Following infusion of the cells into the patient, cells with insertions near genes that confer a competitive engraftment advantage (red, purple clones) will become enriched. Provirus vector integration sites in the purple and red cells are thus over-represented.
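The enrichment illustrated in Figure 1 can be made concrete with a small sketch. It assumes, purely for illustration, that sequence read counts per integration site are a usable proxy for clone abundance (in practice, recovery bias makes this approximate) and that the same sites were sampled before infusion and after engraftment. The function names, the fold-change threshold, and the toy counts are hypothetical and are not taken from any of the studies discussed here.

```python
def clone_frequencies(site_reads):
    """Convert read counts per integration site into fractional contributions."""
    total = sum(site_reads.values())
    return {site: count / total for site, count in site_reads.items()}


def enriched_clones(infused, engrafted, min_fold=5.0):
    """Return (site, fold_change) pairs whose share rose at least min_fold."""
    before = clone_frequencies(infused)
    after = clone_frequencies(engrafted)
    hits = []
    for site, freq_after in after.items():
        freq_before = before.get(site)
        if freq_before is None:
            continue  # not detected before infusion, so fold change is undefined
        fold = freq_after / freq_before
        if fold >= min_fold:
            hits.append((site, fold))
    # Most strongly enriched clones first
    return sorted(hits, key=lambda item: item[1], reverse=True)


# Toy data (hypothetical): keys are (chromosome, position), values are read counts
infused = {("chr3", 1000): 10, ("chr1", 500): 10, ("chr7", 42): 10, ("chr11", 9): 10}
engrafted = {("chr3", 1000): 70, ("chr1", 500): 10, ("chr7", 42): 10, ("chr11", 9): 10}

result = enriched_clones(infused, engrafted, min_fold=2.0)
```

In this toy example only the chr3 clone increases its share of the population; genes near such a site would be the candidates to examine further.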
This review covers the potential of HSC gene therapy studies to identify genes that play a role in engraftment. The use of retroviral mutagenesis screens to identify dysregulated genes involved in cancer has provided an enormous wealth of data. These screens have been used to identify genes that have an effect on the development and progression of leukemia by analyzing replicating virus insertion sites to identify nearby genes that contributed to tumorigenesis and leukemic development. However, it is clear that non-replicating viruses can also perturb nearby genes, causing genotoxicity. Thus, HSC gene therapy studies are de facto mutagenesis screens in which a library of vector-mutagenized cells is infused into patients and clones with a selective advantage to engraft can become over-represented. Although the goal of clinical gene therapy is to develop cures for life-threatening diseases, the data obtained from patient samples can also provide insight into the role of genes in hematopoietic processes. Analysis of retroviral integration sites in preclinical and clinical HSC gene therapy studies has the potential to identify novel genes involved in engraftment and in other hematopoietic processes. Identifying novel engraftment genes can improve our understanding of this complex process and also identify new therapeutic targets to enhance engraftment.
HSCs are commonly harvested from the peripheral blood after mobilization. In order to mobilize HSCs from the bone marrow into the peripheral blood, patients receive recombinant human granulocyte colony-stimulating factor (G-CSF). The patient’s peripheral blood is collected and enriched for HSCs using the CD34+ marker. HSCs are then cultured ex vivo and exposed to viral vectors. The ex vivo culture period varies between studies, but is typically approximately 1-4 days. During this time, vector proviruses integrate into the host genome, leading to a polyclonal population of HSCs that possess numerous proviral insertion sites. This highly polyclonal population of repopulating cells with vector proviruses at many integration sites is in essence a library with the potential to dysregulate a wide variety of genes. Some proviral integration sites may become over-represented during ex vivo culture due to a proliferative or survival advantage of the clone(s) carrying that provirus.
Prior to the infusion of gene-modified HSCs, patients may be treated with chemotherapy agents or irradiation to help enhance the engraftment efficiency. Gene-modified HSCs are re-infused into the patient intravenously. The cells migrate into the bone marrow before finally residing in the sinusoids and perivascular tissue [15,16]. Both homing and hematopoiesis are integral aspects of engraftment. Cells that have reached the stem cell niche through homing will begin producing mature myeloid and lymphoid cells from each blood lineage. Hematopoiesis continues through the action of long-term HSCs, which are capable of self-renewal for life-long generation of the patient’s mature blood cells.
When HSCs are infused into the patient intravenously, the cells must travel from the peripheral blood into the bone marrow, eventually reaching their niche to repopulate the blood system. This process, known as homing, is a multistep process that relies on the action and interactions of various chemokines, cytokines and other proteins. Examples include stromal derived factor 1 (SDF-1) and CXCR4, adhesion molecules such as very late antigen 4 and 5 (VLA-4/5), lymphocyte function associated antigen 1 (LFA-1), and α4β1 integrin interaction with vascular cell adhesion protein 1 (VCAM-1) [12,17-19]. Circulating HSCs roll and tether to the blood vessel walls through the action of E- and P-selectins and VCAM-1. Tethered HSCs extravasate through the bone marrow endothelium before lodging into the bone marrow stem cell niche (Figure 2) [15,20-23]. The entire process is thought to occur within a matter of hours following infusion. HSCs with vector provirus insertions near genes that enhance homing are more likely to engraft and thus these clones may become over-represented during this process.
Figure 2: Infused HSCs must home to their bone marrow niches before they can begin the process of hematopoiesis. After infusion of HSCs into the peripheral blood, shown as purple circles, HSCs begin the process of homing to the marrow. E- and P-selectins and VCAM-1 on the vessel walls tether circulating HSCs and allow for rolling on the vessel wall to occur. This is followed by extravasation of the HSCs through the extracellular matrix into the bone marrow. Binding of SDF-1, released from osteoblasts and epithelial tissues in the bone marrow, to CXCR4 receptors on the HSCs is important for this process. After reaching the bone marrow, HSCs then migrate to the perivascular regions and begin the process of hematopoiesis.
After reaching the bone marrow and lodging in the perivascular region, HSCs begin the process of repopulating the patient’s blood system. During the process of proliferation, some of the daughter cells produced by the infused HSCs remain as quiescent HSCs, while others self-renew or become committed to either the myeloid or lymphoid system as progenitor cells. As gene-modified daughter cells divide, they begin to produce all of the cellular subsets of each lineage, with all progeny carrying the transgene of interest. For HSCs that harbor proviral integrations near genes involved in stem cell renewal or expansion, dysregulation may provide the HSCs with an engraftment advantage. HSCs that have vectors integrated near genes that provide a selective advantage during these processes of self-renewal or expansion will be more likely to engraft, repopulate, and persist in the patient long-term. Examples of such genes include RUNX1 [25-27], globin transcription factor 2 (GATA2) [28,29], spleen focus forming virus proviral integration oncogene (Spi-1), which encodes the transcription factor PU.1 [30,31], as well as homeobox A (HOXA) [32,33].
HSC clones that have vector proviral insertions that dysregulate genes involved with proliferation or survival have a selective advantage at all stages of engraftment. In order for infused cells to engraft and repopulate the patient’s blood system they must make it to the bone marrow without undergoing apoptosis. Dysregulation of genes that confer a survival advantage by inhibiting apoptosis, such as MCL1, could benefit HSCs prior to reaching and after lodging in the bone marrow niche [34,35]. Clones with dysregulated genes that provide a proliferative advantage to HSCs, such as CCND2, have been overrepresented in gene therapy studies.
Integrated vector proviruses have the potential to dysregulate the expression of nearby host cell genes flanking the integration site. Depending on the integration site of the provirus, vector-mediated genotoxicity can lead to gene over-expression, inactivation, or production of novel gene transcripts (Figure 3). Transcriptionally active LTR regions with strong promoters or enhancers are important in the development of genotoxicity. Integrating replication-competent retroviruses are well known for their potential to activate nearby genes leading to oncogenesis. However, it was previously believed that replication-incompetent viral vectors might not mediate significant genotoxicity. Unfortunately, clinical studies have shown that replication-incompetent vectors still cause genotoxicity, in some cases leading to clonal expansion and leukemia.
Figure 3: Mechanisms of insertional mutagenesis. (1) 3’ proviral LTRs can drive over-expression of nearby genes. (2) Enhancers in the LTRs can activate nearby promoters leading to increased transcription. (3) Proviral insertion within a host gene and transcription from the 5’ LTR can lead to the creation of novel gene transcripts. (4) Premature polyadenylation of host cell gene transcripts can be caused by proviral insertion within a gene. Black boxes represent the host gene promoter and grey squares represent the exons. Grey boxes containing white rectangles represent proviral LTRs and striped rectangles are used to show proviral transgenes.
In order to identify integration sites, genomic DNA is extracted from the bone marrow or the peripheral blood of patients that have received gene-modified HSCs. After isolation of the DNA, the amplification of provirus LTR-chromosome junctions is commonly conducted using ligation-mediated PCR (LM-PCR) or linear-amplification-mediated PCR (LAM-PCR). LM-PCR utilizes frequent cutting restriction enzymes that cut genomic DNA into small fragments. Some of these fragments contain an LTR-chromosome junction. Following digestion, these fragments are then ligated to linkers and PCR amplified. LAM-PCR employs linear amplification of LTR-chromosome junctions followed by double-stranded DNA (dsDNA) synthesis. The dsDNA sequences are then digested with restriction enzymes, followed by linker ligation to the sequence and nested PCR. Non-restrictive linear-amplification-mediated PCR (nrLAM-PCR) has been developed, which avoids restriction digest bias of recovered integration sites.
Alternative non-PCR methods, such as shuttle vector rescue, also exist. In shuttle vector rescue, integrated vector proviruses contain a bacterial origin of replication and a selection gene. Peripheral blood DNA from patients is digested with restriction enzymes or randomly sheared, ligated, and then transformed into bacteria, which are grown as colonies. The plasmids rescued from these colonies contain an LTR-chromosome junction that can be sequenced with an LTR-specific primer. Shuttle vector rescue avoids PCR-based skewing of obtained integration sites.
The availability of the human genome sequence, as well as the genomes of other model organisms such as mice and macaques, has allowed for rapid identification of genes near vector proviruses in clinical and preclinical studies. Following sequencing of the LTR-chromosome junction, sequence reads can be aligned to the human genome using the BLAST-like alignment tool (BLAT). Genes and oncogenes located close to the vector integration site can be identified based on the annotation of the human genome. Thus, through the combination of LTR-chromosome junction amplification, next-generation sequencing, and bioinformatics, vector proviruses serve as ideal molecular tags to identify nearby genes.
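Once a read has been aligned, identifying the closest annotated gene is a simple lookup. The following is a minimal sketch of that step, assuming the alignment has already produced a chromosome and position for each integration site. The annotation table, gene names, coordinates, and the 50 kb distance cutoff are all hypothetical placeholders; a real analysis would load a genome annotation and would also consider gene bodies and strand, which this sketch ignores.

```python
import bisect

# Hypothetical annotation: chromosome -> sorted list of (gene start, gene name).
# Coordinates and names are invented for illustration only.
ANNOTATION = {
    "chrA": [(10_000, "GENE1"), (55_000, "GENE2"), (120_000, "GENE3")],
}


def nearest_gene(chrom, position, annotation, max_distance=50_000):
    """Return (gene name, distance) for the closest gene start, or None."""
    genes = annotation.get(chrom, [])
    if not genes:
        return None
    starts = [start for start, _ in genes]
    # Index of the first gene start at or after the integration site
    idx = bisect.bisect_left(starts, position)
    # Only the neighbours on either side of the site can be the closest
    candidates = genes[max(idx - 1, 0): idx + 1]
    start, name = min(candidates, key=lambda gene: abs(gene[0] - position))
    distance = abs(start - position)
    return (name, distance) if distance <= max_distance else None


hit = nearest_gene("chrA", 60_000, ANNOTATION)
too_far = nearest_gene("chrA", 200_000, ANNOTATION)
```

The binary search keeps each lookup fast, which matters when the number of mapped integration sites is large.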
Proviral vector integration occurs throughout the genome, but different viral vector types have different integration site preferences. HIV-based lentiviral vectors favor active genes, while murine leukemia virus (MLV) vectors favor transcription start sites [41,42]. Gammaretroviruses, such as MLV, integrate preferentially at previously identified common integration sites (CISs) listed in the retroviral tagged cancer gene database (RTCGD). The RTCGD is composed of retroviral integration site data acquired from mouse tumors from a variety of different studies and tumor types. The RTCGD allows researchers to identify candidate cancer genes dysregulated by proviruses that may play a role in human cancer development and progression [13,45].
HSC gene therapy trials utilizing MLV and lentiviral vectors have shown that proviral insertions are observed in specific classes of genes [46-50]. Both MLV and lentiviral vector proviruses are over-represented near genes involved in the establishment and/or maintenance of chromatin architecture, signal transduction, and cell cycle. Lentiviral vector proviruses were also over-represented near genes involved in chromatin remodeling and phosphorylation. Many of the genes identified in retroviral mutagenesis screens are linked in gene networks involved in cellular regulatory processes such as apoptosis, signal transduction, and transcriptional regulation. The over-representation of vector provirus near genes involved in such processes is likely due to the survival and proliferative advantages that such mutations could confer to HSCs. For example, in a retrospective study of vector integration sites in rhesus macaques that had received autologous MLV-transduced hematopoietic repopulating cells, the MDS1/EVI1 site was identified as a hot spot of vector insertion. It is likely that vector provirus dysregulation of this locus provided the infused cells with the potential for increased survival, proliferation, or both. Studies have shown that the overexpression of EVI1/MDS1 has the potential to delay or inhibit the myeloid differentiation of HSCs, while increasing the proliferation of HSCs and progenitor cells. This has also been reported for mouse and monkey HSCs [46,54,55]. Proviral integration leading to the dysregulation of the EVI1/MDS1 gene complex can lead to the over-expression of either or both genes, inhibiting cellular differentiation. Dysregulation of this locus has been shown to be involved in clonal expansion and leukemic development, with integration sites likely providing a survival or proliferation advantage to transduced HSCs.
Extended culture of macaque HSCs revealed an increase in HSC clones with integration sites in or near the EVI1/MDS1 locus compared to other infused clones. Thus, analysis of vector provirus integration sites can provide evidence for dysregulated genes in the absence of adverse events. These data demonstrate the potential of preclinical gene therapy studies to identify genes involved in engraftment and hematopoiesis pathways, as well as their role in gene networks related to these processes.
Retroviral mutagenesis screens have played an important role in determining genes involved in hematopoiesis. Forward retroviral mutagenesis screens in hematopoietic cells have been highly successful in identifying genes involved in migration, proliferation, and expansion. Identified genes include Rac2, Jak/Stat, and Notch [44,56]. Notch expression is important in embryonic development, and throughout life for tissue homeostasis. Dysregulated expression of Notch can affect HSC differentiation and lead to skewed differentiation of hematopoietic lineages. A study of murine tumor retroviral insertion sites by Suzuki and colleagues identified Notch as a CIS. Based on the role of Notch family genes in hematopoietic differentiation, dysregulation by proviral insertional mutagenesis has the potential to enhance hematopoietic repopulation.
The role of GATA proteins, especially GATA-1 and GATA-2, is also well established in HSC biology. Both are highly expressed in erythroid precursors. As cells differentiate, the GATA-2 level decreases while GATA-1 expression is maintained at high levels. GATA-2 expression is essential for HSC maintenance, survival, and proliferation [29,61,62]. Since GATA-2 expression levels are important in HSC proliferation and differentiation, dysregulation of GATA-2 expression would be expected to enhance engraftment following transplantation of gene-modified cells. GATA2 has in fact been identified as a CIS.
Replication-competent retroviruses cause insertional mutagenesis, leading to their common use in mutagenesis screens [13,63]. Although replication-incompetent vectors are capable of providing only a single-hit genetic modification via provirus integration, they can still be utilized to identify dysregulated genes. Deichmann and colleagues investigated integration site data from five clinical gene therapy trials and three pre-clinical studies. This retrospective analysis showed that transplanted gene-modified HSCs had very similar integration sites and dysregulated genes. The most frequent CISs were insertions that would dysregulate genes leading to clonal expansion or leukemic development, such as LMO2 and MDS1/EVI1. Thus, the same dysregulated genes are often observed in multiple gene therapy studies.
The CGD study by Ott and colleagues revealed that the dysregulation of PRDM16 and EVI1/MDS1 caused clonal expansion. The expression of PRDM16 has been shown to be involved in HSC maintenance and renewal. Cells lacking expression of PRDM16 exhibit increased cell death, so overexpression is expected to lead to an over-representation of clones with proviral insertions near PRDM16. PRDM16 may be in a gene network involving MDS1/EVI1, GATA2, and other genes that affect HSCs [65,66]. Thus, dysregulation of the PRDM16 gene locus could affect the signaling pathways of other genes involved in normal hematopoiesis, amplifying the effects of PRDM16 dysregulation. During the French SCID-X1 study, the dysregulation of LMO2 likely led to the proliferation of common lymphoid progenitor cells. Over time, dysregulation of LMO2 led to the expansion of the lymphoid hematopoietic lineage. LMO2 is normally expressed only in the earliest stages of lymphopoiesis, and continued expression in mature T-lymphocytes leads to the development of lymphoblastic leukemias. In the French SCID-X1 study, the dysregulated expression of LMO2 ultimately resulted in lymphoblastic leukemia [67,68].
With the large proviral integration site data sets that gene therapy trials can provide, the ability to quickly and efficiently analyze integration site profiles should help to identify candidate engraftment genes. One tool for this purpose is the QuickMap utility provided by the gene therapy safety group (GTSG). The QuickMap utility relies on cancer gene lists provided by the Catalogue of Somatic Mutations in Cancer (COSMIC) as well as the RTCGD. The QuickMap utility is able to rapidly analyze sequence data from LTR-chromosome junctions to determine the proviral integration site. Once the integration site is known, it can identify whether the vector provirus is within a gene (including known oncogenes), within a CpG island, or in a repetitive DNA sequence. Further, the software compares the integration site data to a randomly generated data set of one million integrations as a control. This utility has been used previously with ex vivo transduced human cells to explore the effect of chemoselection of HSCs on integration site patterns. Within the analyzed data, two of the sixteen CISs identified, STAT5B and TNRC6C, were previously identified as CISs.
One limitation of the QuickMap utility is the submission limit of fifty thousand sequences per analysis, although this limit can be temporarily increased by contacting the GTSG. Because many gene therapy studies now use next-generation sequencing, where sequence reads can number in the hundreds of thousands to millions, this restriction may limit future use of QuickMap. With the increasing number of pre-clinical and clinical HSC gene therapy trials, the development of new utilities to efficiently analyze millions of integration site sequence reads from next-generation sequencing may aid in the discovery of additional CISs in HSCs. These CISs may in turn identify novel engraftment genes.
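The general idea of calling CISs and comparing them with randomly placed integrations can be sketched in a few lines. This is not the QuickMap algorithm, whose internals are not described here; it is a simplified illustration under stated assumptions. A CIS is defined, for this sketch only, as a run of at least three sites on one chromosome in which each site lies within 30 kb of the previous one, and control sites are drawn uniformly along toy chromosomes. The function names, window, threshold, chromosome lengths, and site positions are all hypothetical.

```python
import random
from collections import defaultdict


def find_cis(sites, window=30_000, min_sites=3):
    """Group sites into clusters where consecutive sites are <= window bp apart."""
    by_chrom = defaultdict(list)
    for chrom, pos in sites:
        by_chrom[chrom].append(pos)
    clusters = []
    for chrom, positions in by_chrom.items():
        positions.sort()
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= window:
                current.append(pos)
            else:
                if len(current) >= min_sites:
                    clusters.append((chrom, current[0], current[-1], len(current)))
                current = [pos]
        if len(current) >= min_sites:
            clusters.append((chrom, current[0], current[-1], len(current)))
    return clusters


def random_sites(n, chrom_lengths, rng):
    """Draw n uniformly placed control sites, weighting chromosomes by length."""
    chroms = list(chrom_lengths)
    weights = [chrom_lengths[c] for c in chroms]
    picks = rng.choices(chroms, weights=weights, k=n)
    return [(c, rng.randrange(chrom_lengths[c])) for c in picks]


# Toy observed sites and toy chromosome lengths (hypothetical values)
observed = [("chr1", 100_000), ("chr1", 110_000), ("chr1", 125_000),
            ("chr1", 5_000_000), ("chr2", 40_000), ("chr2", 900_000)]
chrom_lengths = {"chr1": 10_000_000, "chr2": 8_000_000}

observed_cis = find_cis(observed)

# Fraction of random data sets with at least as many CISs as observed
rng = random.Random(0)
control_counts = [len(find_cis(random_sites(len(observed), chrom_lengths, rng)))
                  for _ in range(200)]
fraction = sum(c >= len(observed_cis) for c in control_counts) / len(control_counts)
```

A small fraction suggests that the observed clustering is unlikely under uniform placement, although a uniform control ignores the known integration preferences of each vector type and would overstate significance in a real analysis.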
A study by Kiem and colleagues of three baboons that received baboon hematopoietic repopulating cells exposed to a gammaretroviral vector revealed a CIS of 664 base pairs in a CpG island located between zinc finger protein 91 (ZFP91) and leupaxin (LPXN). It was hypothesized that the CIS between ZFP91 and LPXN led to the dysregulation of one or both genes, providing the HSCs with an engraftment advantage. Thus these two genes may play a role in normal engraftment pathways. This study suggests that other HSC gene therapy trials may identify CISs that are near genes, including microRNAs, previously not associated with engraftment. Previous studies regarding the roles of genes involved in the engraftment process have been utilized to improve HSC transplantation. Newly identified genes could serve as targets for novel small molecule drugs that increase their expression prior to HSC infusion. These drugs could be of benefit to patients receiving any type of HSC transplantation, and may be of significant value in the field of cord blood transplantation, where low cell numbers and low engraftment limit clinical use.
The ability of retroviral vectors to dysregulate genes can be exploited to better understand many other biological processes. If a library of cells mutagenized with retroviral vectors is placed under any selective pressure, those clones with integrants near genes that provide a selective advantage will be enriched. For example, it should be possible to analyze over-represented genes in specific lineages of hematopoietic repopulating cells. If a set of genes is over-represented near vector proviruses in myeloid but not lymphoid repopulating cells, those genes are candidates for affecting myeloid differentiation and expansion. There are many possible uses of this technology. Replication-incompetent vectors have been used to identify genes involved in liver cancer, and we are using this approach to study the development of acute myeloid leukemia (GDT unpublished data). Further, retroviral integration sites could provide insight into genes that play a role in the metastasis of solid tumor cells to the bone marrow, such as in prostate cancer. Analysis of gene expression in cancer cells that have metastasized to the bone marrow could provide insight into genes that helped them engraft in the bone marrow. Genes found to assist in homing and engraftment would be potential molecular targets to reduce the likelihood of metastasis to the bone marrow. Thus, the data obtained from mutagenesis screens using replication-incompetent vectors should provide useful information about the physiological role of genes and their interactions in gene networks for many biological processes including cancer.
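The lineage comparison described above, in which a gene set is over-represented near proviruses in myeloid but not lymphoid cells, amounts to a test on a 2x2 table. The sketch below uses a one-sided Fisher exact test as one reasonable choice; the source does not prescribe a particular statistic, and the counts are invented for illustration.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact test for the table [[a, b], [c, d]].

    a: myeloid sites near the gene set    b: myeloid sites elsewhere
    c: lymphoid sites near the gene set   d: lymphoid sites elsewhere
    Returns P(X >= a) under the hypergeometric null of no lineage difference.
    """
    row1 = a + b          # all myeloid sites
    col1 = a + c          # all sites near the gene set
    n = a + b + c + d     # all sites
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k)
               for k in range(a, upper + 1))
    return tail / comb(n, row1)


# Hypothetical counts: 12 of 100 myeloid sites and 2 of 100 lymphoid sites
# fall near the gene set of interest.
p_value = fisher_one_sided(12, 88, 2, 98)
```

A small p-value would flag the gene set as a candidate for a myeloid-specific effect. Lineages also differ in which genes are transcriptionally active, so integration preference alone could produce such a difference and would need to be excluded.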
Pre-clinical and clinical trials utilizing HSCs with retroviral vectors have yielded important information regarding the effects of retroviral insertional mutagenesis on host genes. Through the use of annotated genomes for humans and model organisms, retroviral insertion sites in gene therapy trials can be mapped to the genome to determine nearby potentially dysregulated genes. Advances in bioinformatics, and the creation of cancer gene databases, such as the RTCGD, have been instrumental in identifying CISs and thus dysregulated genes.
As the number of HSC gene therapy trials increases, more data regarding the role of genes in biological processes will be obtained. The data from these studies can be mined to identify genes that provide a competitive engraftment advantage to infused HSCs. Studies without observed abnormal hematopoiesis following engraftment still have the potential to identify genes that have an effect on hematopoiesis and engraftment. Novel engraftment genes might be targeted with small molecule drugs to increase the engraftment efficiency of infused HSCs. Therefore, HSC gene therapy trials carry the potential to improve HSC transplantation by providing data that identify genes and gene networks involved in engraftment and hematopoietic pathways.
This publication was supported by National Institutes of Health award numbers AI097100, AI102672, and CA173598 (GDT).