Widespread Exonization of Transposable Elements in Human Coding Sequences is Associated with Epigenetic Regulation of Transcription
Corresponding Author:
Pierre R Bushel
Microarray and Genome Informatics Group, Biostatistics Branch
National Institute of Environmental Health Sciences, USA
E-mail: [email protected]
Received March 01, 2013; Accepted June 17, 2013; Published June 19, 2013
Citation: Huda A, Bushel PR (2013) Widespread Exonization of Transposable Elements in Human Coding Sequences is Associated with Epigenetic Regulation of Transcription. Transcriptomics 1:101. doi: 10.4172/2329-8936.1000101
Copyright: © 2013 Huda A, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: Transposable elements (TEs) have long been regarded as selfish or junk DNA with little or no role in the regulation or functioning of the human genome. Over the past several years, however, this view has been challenged as studies have provided both anecdotal and global evidence that TEs contribute to the regulatory and coding needs of human genes. In this study, we explored the incorporation and epigenetic regulation of coding sequences donated by TEs using gene expression and ancillary genomics data from two human hematopoietic cell lines: GM12878 (a lymphoblastoid cell line) and K562 (a chronic myelogenous leukemia cell line). In each cell line, we found several thousand instances of TEs donating coding sequences to human genes. We compared transcriptome assemblies of the RNA sequencing (RNA-Seq) reads built with and without the aid of a reference transcriptome, and found that the percentage of genes incorporating TEs in their coding sequences is significantly greater in the assembly built without a reference than in the reference-guided assemblies based on RefSeq and GENCODE gene models. We also used histone modification chromatin immunoprecipitation sequencing (ChIP-Seq) data, Cap Analysis of Gene Expression (CAGE) data and DNase I hypersensitivity site (DHS) data to demonstrate the epigenetic regulation of the TE-derived coding sequences. Our results suggest that TEs form a significantly higher percentage of coding sequences than is represented in gene annotation databases, and that these TE-derived sequences are epigenetically regulated in accordance with their expression in the two cell types.
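The central quantity in the abstract, the percentage of genes whose coding sequences incorporate TEs, can be illustrated with a simple interval-overlap count. The following is a minimal sketch and not the authors' pipeline: it assumes coding exons and TE annotations are already available as half-open (start, end) intervals, and the function names and toy coordinates are hypothetical.

```python
from collections import defaultdict


def build_te_index(te_intervals):
    """Group TE intervals by chromosome and sort each group by start."""
    index = defaultdict(list)
    for chrom, start, end in te_intervals:
        index[chrom].append((start, end))
    for chrom in index:
        index[chrom].sort()
    return index


def overlaps_te(index, chrom, start, end):
    """Return True if the half-open interval [start, end) overlaps any TE."""
    for te_start, te_end in index.get(chrom, []):
        if te_start >= end:
            break  # TEs are sorted by start, so none later can overlap
        if te_end > start:
            return True
    return False


def fraction_genes_with_te(exons, te_intervals):
    """Fraction of genes with at least one coding exon overlapping a TE."""
    index = build_te_index(te_intervals)
    genes, te_genes = set(), set()
    for gene, chrom, start, end in exons:
        genes.add(gene)
        if overlaps_te(index, chrom, start, end):
            te_genes.add(gene)
    return len(te_genes) / len(genes) if genes else 0.0


# Toy data (hypothetical coordinates, not taken from the study)
exons = [
    ("geneA", "chr1", 100, 200),
    ("geneA", "chr1", 300, 400),
    ("geneB", "chr1", 500, 600),
    ("geneC", "chr2", 50, 150),
]
tes = [
    ("chr1", 350, 450),  # overlaps the second exon of geneA
    ("chr2", 150, 250),  # only abuts geneC, so it is not counted
]

print(f"{fraction_genes_with_te(exons, tes):.1%} of genes contain TE-derived coding sequence")
```

Running the same count once on a reference-guided assembly and once on an assembly built without a reference would give the two percentages that the abstract compares. A real analysis would also need to handle strand, minimum overlap length, and transcript isoforms, none of which are modeled here.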