Received date: June 07, 2014; Accepted date: August 20, 2014; Published date: August 25, 2014
Citation: Wu B, Li C, Xie J, Du Z, Luo L, et al. (2014) Bioinformatics Analyses of m-RNA Profiling Following Ezrin Knockdown in Esophageal Squamous Cell Carcinoma. J Cancer Sci Ther 6:314-321. doi:10.4172/1948-5956.1000287
Copyright: © 2014 Wu B, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Visit for more related articles at Journal of Cancer Science & Therapy
Ezrin is involves in multiple cancer cell functions. In this study, differentially expressed genes (DEGs) identified from mRNA expression profiling following Ezrin knockdown in esophageal squamous cell carcinoma (ESCC), were analyzed by multiple bioinformatics methods for a comprehensive understanding. Gene Ontology enrichment found significant terms related to immunity, stimulus response, extracellular matrix-binding and signal transduction. Functional Annotation Chart showed except GO categories, these DEGs were annotated by 72 functional category terms. Subpathway analyses revealed the DEGs were involved in 36 subpathways, overwhelming the traditional pathway enrichment. Promoter analyses showed specific sequence patterns and transcription factors correspond to the co-downregulation and co-upregulation of DEGs. MNB1A, DOF2, PBF, YY1, PRRX2, UBX and ARNTAHR co-regulate the downregulated genes, whereas GATA2, ZNF42, deltaEF1, MYCN and ARNT co-regulate the upregulated genes. This paper provides a workflow for a better understanding the roles of Ezrin knockdown in ESCC by the bioinformatics analyses of its DEGs.
Ezrin; Bioinformatics; Differentially expressed genes; Esophageal squamous cell carcinoma
DEG: Differentially Expressed Genes; ERM: Ezrin– Radixin–Moesin; ESCC: Esophageal Squamous Cell Carcinoma; GO: Gene Ontology; DAVID: Database for Annotation, Visualization and Integrated Discovery; KEGG: Kyoto Encyclopedia of Genes and Genomes; UCSC: University of California Santa Cruz
Ezrin, also named VIL2, is a member of the Ezrin–radixin– moesin (ERM) protein family. Ezrin acts as a linker between the actin cytoskeleton and plasma membrane proteins, as well as a signal transducer involving in cytoskeletal remodeling . Ezrin has been implicated in multiple aspects of cancer cell function, especially the metastasis of cancer cells [2-4]. Moreover, Ezrin translocates to the nucleus to regulate gene transcription in several cancer cell types [5- 7]. Our previous researches show Ezrin is over-expressed in human esophageal squamous cell carcinoma (ESCC), which is associated with the invasive phenotype of malignantly transformed esophageal epithelial cells . We subsequently found Ezrin translocates from the plasma membrane to the cytoplasm in the progression from normal epithelium to invasive carcinoma of esophageal epithelial cells .
The knockdown of Ezrin expression by RNA interference leads to suppression of growth, adhesion and invasiveness of ESCC cells in vitro, and tumorigenicity in nude mice also reveals that Ezrin regulates tumor formation in vivo . Furthermore, mRNA expression profiling was obtained using Affymetrix GeneChip Human genome U133 plus 2.0 and indicated that certain proliferation- and invasiveness-related genes are involved in Ezrin function . However, the biological meaning of this mRNA profile is not fully understood. For a comprehensive understanding of the roles of Ezrin, we re-analyzed this mRNA expression profile by multiple bioinformatic methods
Identification of differentially expressed genes
The mRNA expression profiling following Ezrin knockdown by RNAi in ESCC EC109 cell line is available from GEO database (GSE6233) at http://www.ncbi.nlm.nih.gov/geo/. The raw data is normalized and log transformed. The differentially expressed genes (DEGs) are identified based on a threshold of 2-fold change.
Gene ontology enrichment and functional annotation
Gene Ontology (GO) enrichment analysis was performed using the WebGestalt webserver, which contains information from Gene Ontology Tree Machine software (http://bioinfo.vanderbilt.edu/ webgestalt/) . The whole human genome gene set was set as a reference list. The hypergeometric statistical method test was used, and only statistically enriched terms (P < 0.05) with at least 2 genes were selected.
DAVID (http://david.abcc.ncifcrf.gov/) consists of an integrated biological knowledge base and analytic tools aimed at systematically extracting biological meaning from large gene/protein lists. Functional Annotation Chart in DAVID bioinformatics aims to identify overrepresented biological terms associated with a given gene list . Currently, Functional Annotation Chart provides over 40 category enrichments. Except for GO terms, it also includes protein functional domains, protein–protein interactions, sequence features, disease associations, pathways, homology, gene functional summaries and literature. The significantly enriched terms, from Functional Annotation Chart, with P < 0.05 were visualized by Enrichment Map plugin of Cytoscape .
KEGG pathway and subpathway analysis
The DEGs were enriched in the KEGG pathways by applying the bioconductor SubpathwayMiner package. Except for entire pathways, Subpathway Miner could also detect the local pathway structure (subpathway), which helps to understand the detailed information of the genes of interest . This package adopts the k-clique concept to define a subpathway. All possible subpathways were obtained by searching all possible paths between start-points and end-points in the adjacency matrix generated by node relationships in an entire KEGG pathway. The k was set to 4 in this study, which means that the distance among all genes within the subpathways was no greater than 4.
Promoter sequence patterns and potential transcription factors of DEGs
The 2000 bp promoter sequence of the top 20 most changed downregulated and upregulated genes were downloaded from the UCSC genome database (http://genome.ucsc.edu), respectively. Promoter sequence patterns over-represented or down-represented in these two sequence sets were analyzed by the POCO program (http://ekhidna. biocenter.helsinki.fi/poxo/poco/poco). POCO is tool to find distinctly represented regulatory patterns from either one or two sequence sets. The background organism was set as homo_sapiens_clean, and the longest pattern length was set at 8 in this study. Subsequently, significant sequence patterns were screened in the JASPAR database to identify recognized transcription factors (similarity index >0.70) .
GO enrichment and functional annotation
We obtained 244 DEGs in total, including 199 upregulated genes and 45 downregulated genes, using a 2-fold change as the threshold (Supplementary File 1). To understand their functional categories, GO enrichment analysis was performed on these DEGs using hypergeometric test and multiple test adjustment method BH through WebGestalt (Figure 1). The result is showed in the tree view to illustrate the enriched GO structure. To our surprise, we found twelve Biological process terms related to immunity and response to stimuli, such as “response to stress” (65 genes, P= 3.20E-05), “response to biotic stimulus” (18 genes, P= 6.50E-03), and “defense response to virus” (12 genes, P= 4.00e-04). These are indicated in the left portion of the result graph (Figure 1). Ezrin interacts with both the plasma membrane and filamentous actin. GO terms associated with cytoskeleton organization were found. In the Cellular component enrichment, the term “extracellular matrix-binding” contains four genes (CYR61, DCN, SMOC1 and OLFML2B) with significantly altered expression (P= 4.62e-02), suggesting these 4 genes participate in extracellular matrix re-modeling. Moreover, Ezrin might affect signal transduction between cells. As many as 70 genes are enriched in “Signal transduction”, suggesting intracellular signal transduction was significantly impacted by Ezrin knockdown. Dozens of the genes regulating cell growth and cell death are also enriched.
Figure 1: GO enrichment of differentially expressed genes from the mRNA expression profile of Ezrin knockdown was performed by the WebGestalt webserver. The enriched GO terms are visualized using a graphical representation with red color coding reflecting the significant enrichment, while those in black are non-enriched parents. Listed in the boxes is the name of the GO category, the number of genes in the category and the P-value indicating the significance of enrichment.
The DEGs were also clustered by Functional Annotation Chart in DAVID bioinformatics, and the enrichment result was visualized by the Enrichment Map in Cytoscape (Figure 2). Each functional category is represented by a node, and the node size represents the number of enriched genes (Figure 2). Nodes of the same functional category are indicated by the same shape. The edge between nodes is established as overlapping genes exist between these two nodes, and the width is defined as the overlap coefficient between these two categories. In total, 158 Functional Annotation Chart enrichments are identified, excluding 86 terms from three GO categories, but including 72 terms from the following annotation categories, 17 INTERPRO, 7 SMART, 17 SP_PIR_KEYWORDS, 23 UP_SEQ_FEATURE, 4 PIR_ SUPERFAMILY, 1 COG_ONTOLOGY and 3 KEGG_PATHWAY. In the category of SP_PIR_KEYWORDS, as many as 52 genes are enriched in the term of “Signal”, which suggests an important signaling transduction role for Ezrin. Ezrin knockdown greatly influenced the activities of other signaling molecules, with nine genes (MAFF, FOS, FOSL2, ATF3, HIST1H2BC, HIST1H2AG, POU5F1, PRDM1 and TNFAIP3) are enriched in the “DNA binding” term in the category of SP_PIR_KEYWORDS. On the other hand, the “DNA-binding region:Basic motif” term, in the category of UP_SEQ_FEATURE, contains 8 genes (MAFF, FOS, FOSL2, ATF3, SREBF1, MGA, NFIL3 and MXI1). A total of 13 unique genes, characterized by DNA-binding, appear to be critical downstream effectors of Ezrin, suggesting Ezrin exerts a large biological impact through gene transcription regulation. The top enrichment term in KEGG-PATHWAY, is “Pathways in cancer”, which consists of 10 genes (FOS, CDKN1A, FZD10, BMP2, PTGS2, CASP9, PDGFA, ITGAV, EGLN3 and KITLG). These chart results would provide an overview of the biological impact of Ezrin knockdown in ESCC.
Figure 2: Visualization of results from Functional Annotation Chart for the differentially expressed genes using the Enrichment map of Cytoscape. Each node represents a significant functional term with its size corresponds to number of genes. Edges indicate gene overlap between nodes, where the thickness indicates the size of the overlap.
Pathway and subpathway enrichments
To identify how the downstream genes from Ezrin knockdown influence cell signal transduction, the DEGs were mapped to KEGG pathways by traditional entire KEGG pathway enrichment. DEGs are found enriched only in 10 entire pathways, of which only two pathways (Pathways in cancer and p53 signaling pathway) are related to carcinoma progression (Table 1). A subpathway is limited to local area of an entire pathway, which helps to understand how the interested genes affect the pathway locally. Using SubpathwayMiner analysis, our DEGs are found significantly enriched in 36 subpathways that corresponded to 21 entire pathways (Table 2). Two subpathways that derived from the entire Pathways in caner (path: 05200) are showed, path: 05200_4 involved 5 genes (CDKN1A, CASP9, PDGFA, ITGAV and KITLG), while path: 05200_20 contained 3 genes (CASP9, CDKN1A and FZD10) (Figure 3). Except for the 10 entire pathways found by the traditional pathway enrichment, dozens of subpathways derived from many other entire pathways are found which were not detected by traditional KEGG pathway enrichment because of the statistical threshold (Table 2).
Figure 3: Enriched differentially expressed gene subpathways from Pathway in cancer (path:05200). The two significant subpathways, path: 05200_4 and 05200_20, are indicated by the blue and yellow shapes, respectively. The subpathway structures of path: 05200_4 and path: 05200_20 generated by Subpathway package are also shown, with the differentially expressed genes indicated by a red frame.
|path:05014||Amyotrophic lateral sclerosis (ALS)||4/287||0.0052|
|path:00051||Fructose and mannose metabolism||3/287||0.0100|
|path:05200||Pathways in cancer||10/287||0.0107|
|path:04115||p53 signaling pathway||4/287||0.0118|
|path:04710||Circadian rhythm - mammal||2/287||0.0336|
|path:00030||Pentose phosphate pathway||2/287||0.0489|
#: the first number refers to the annotated DEGs in the pathway; the second number is total number of pathways.
Table 1: Enriched KEGG pathways of the DEGs
|Entire pathway ID||Entire pathway Name||subpathway ID||p-value|
|path:00030||Pentose phosphate pathway||path:00030_3||0.0489|
|path:00051||Fructose and mannose metabolism||path:00051_1||0.0058|
|path:04060||Cytokine-cytokine receptor interaction||path:04060_39||0.0262|
|path:04115||p53 signaling pathway||path:04115_1||0.0484|
|path:04610||Complement and coagulation cascades||path:04610_5||0.0489|
|path:04620||Toll-like receptor signaling pathway||path:04620_15||0.0489|
|path:04622||RIG-I-like receptor signaling pathway||path:04622_1||0.0033|
|path:04672||Intestinal immune network for IgA production||path:04672_8||0.0262|
|path:04710||Circadian rhythm - mammal||path:04710_1||0.0308|
|path:04912||GnRH signaling pathway||path:04912_5||0.0073|
|path:05014||Amyotrophic lateral sclerosis (ALS)||path:05014_2||0.0230|
|path:05200||Pathways in cancer||path:05200_20||0.0178|
|path:05222||Small cell lung cancer||path:05222_6||0.0390|
Table 2: Enriched KEGG subpathways of the DEGs.
Subpathways, such as Cytokine-cytokine receptor interaction and ECM-receptor interaction, are expected based on existing knowledge of Ezrin biology. Interestingly, several metabolism-related subpathways are unexpectedly over-represented in this enrichment analysis, for example, pathways of “Glycolysis/Gluconeogenesis”, “Pentose phosphate pathway”, “Fructose and mannose metabolism”, “Purine metabolism” and “Nitrogen metabolism”. These pathways are not widely acknowledged, but are highly over-represented in our analysis. Of these five pathways, “Pentose phosphate pathway” and “Purine metabolism” are discovered by subpathway analysis.
Promoter sequence patterns in upregulated and downregulated genes
We assumed that in the promoters of upregulated and downregulated genes, there would be certain sequence patterns corresponding to transcriptional regulation for gene co-expression following Ezrin knockdown. POCO was applied to find overrepresented and under-represented regulatory patterns between the upregulated and downregulated gene promoter sequence sets. We found 68 significant sequence patterns that are over-represented in the downregulated genes, but under-represented in the upregulated genes, while 64 significant sequence patterns are over-represented in the upregulated genes. The top 20 patterns presented in downregulated or upregulated genes are shown in Tables 3 and 4, respectively. All these patterns appeared in all 20 most downregulated or upregulated genes, respectively.
|PATTERN||OCC1 (#PRO / #TOT)||OCC2 (#PRO / #TOT)||FSCORE||p-value|
Note: 1. OCC: total number of patterns within the sequences of the corresponding cluster (OCC1 represents the downregulated DEG promoter sequence set; OCC2 represents the upregulated DEG promoter sequence set) 2. PRO: total number of sequences with the pattern in the corresponding cluster 3. TOT: total number of sequences in the corresponding cluster 4. FSCORE: results of the ANOVA between the two input clusters and the background sequence set)
Table 3: Sequence patterns over-represented in the downregulated genes, but under-represented in the upregulated genes.
|PATTERN||OCC1 (#PRO / #TOT)||OCC2 (#PRO / #TOT)||FSCORE||p-value|
Note: 1. OCC: total number of patterns within the sequences of the corresponding cluster (OCC1 represents the downregulated DEG promoter sequence set; OCC2 represents the upregulated DEG promoter sequence set) 2. PRO: total number of sequences with the pattern in the corresponding cluster 3. TOT: total number of sequences in the corresponding cluster 4. FSCORE: result of the ANOVA between the two input clusters and the background sequence set)
Table 4:Sequence patterns over-represented in the upregulated genes, but under-represented in the downregulation genes.
Subsequently, these significant patterns were then screened using the JASPAR database to identify potential transcription factors (TFs). Five unique patterns, corresponding to seven transcription factors (MNB1A, DOF2, PBF, YY1, PRRX2, UBX and ARNT-AHR), were found in the downregulated gene set (Figure 4). On the other hand, two unique patterns corresponding to five transcription factors (GATA2, ZNF42, deltaEF1, MYCN and ARNT) were found in the upregulated gene set (Figure 4). These results suggest these transcription factors might be responsible for the transcriptional regulation of the downregulated and upregulated DEGs.
Figure 4: Motif logos of the predicted transcription factors. Potential transcription factors were identified by sequence patterns from searching the JASPAR database. Potential transcription factors regulating downregulated genes are labeled in blue, while transcription factors regulating downregulated genes are labeled in red.
Esophageal cancer is the eighth most common cancer and the sixth most common cause of cancer deaths worldwide, with esophageal squamous cell carcinoma (ESCC) being the dominant type in East Asia . Previous research indicates Ezrin plays critical roles in the invasion and metastasis of ESCC [8-10]. In this study, we present a comprehensive analysis of mRNA profile following Ezrin knockdown in ESCC EC109 cell line by using various bioinformatic methods to reveal the potential mechanisms of Ezrin action.
The 244 DEGs were initially analyzed using GO enrichment tools. Ezrin protein interacts with both the plasma membrane and filamentous actin. Its change might cause the rearrangement of cytoskeleton, which might explain that many significant Biological process terms related to immunity and response to stimulus are found. Cellular component enrichment identified four genes involved in extracellular matrix binding (CYR61, DCN, SMOC1 and OLFML2B). These genes have been reported to play important roles in cancer cell migration and metastasis. CYR61 is a potential target downstream of the TGF-β/p38 MAPK pathway to regulate cell migration that plays a key role in the metastasis of osteosarcoma cells [17,18]. Decorin (DCN) is a candidate biomarker in primary breast cancer patients and plays important roles in the regulation of the tumor microenvironment and pathways related to tumorigenesis . On the other hand, as many as 70 signal transduction genes and other genes regulating cell growth and cell death are also enriched, suggesting the intracellular signal transduction, growth, and death are significantly impacted after Ezrin knockdown. These results are consistent with our previous report .
To understand the functional classification of these DEGs to a greater extern, the DEGs were also clustered by Functional Annotation Chart and presented by visualization. Fifty-two genes are enriched in the term of “Signal”, which is consistent with the GO enrichment result. Several findings indicate that Ezrin alters gene transcription. The enriched terms of “DNA binding” and “DNA-binding region: Basic motif” containing 13 unique genes indicate that Ezrin regulates gene transcription following the alteration of signal transduction. Of these, adenylate kinase-4 is a marker for poor clinical outcome and promotes metastasis in lung cancer by downregulating the transcription factor ATF3 . Though Ezrin is a membrane-actin cross-linking protein, its translocation to cytosol has been found in multiple cancers, which is considered to be involved in cell motility, invasion, and carcinoma metastasis [21-23]. These relevant findings strongly indicate the ability of Ezrin in the regulation of gene expression. Consistent with these findings, our microarray results also suggest the great impact of Ezrin on the gene expression regulation by the regulatory cascade from the signal transduction to DNA binding. These chart results would provide an overview of the biological impact of Ezrin knockdown in ESCC.
The subpathway mining algorithm in SubpathwayMiner has been successful in identifying disease-related subpathways of biological significance [24,25]. One of the advantages of the subpathway method is that it identifies more pathways than the traditional pathway enrichment. Using SubpathwayMiner, we found several metabolismrelated subpathways for the DEGs in this study, supporting recent findings about the relationship between Ezrin and cellular metabolism . Ren et al. found differentially expressed genes that obtained from mutant Ezrin overexpression in osteosarcoma were functionally linked to carbohydrate and amino acid metabolism. In particular, cells expressing closed, inactive Ezrin exhibited reduced lactate production and basal or ATP-dependent oxygen consumption . It suggests that the correlation between Ezrin and metabolism might be a common phenomenon in different cell type of cancers. The second advantage of the subpathway method is it provides more detailed knowledge about the local structure of interested pathways. In this study, two subpathways from Pathway in Cancer indicate the information flow of DEGs in the pathway. The relationship between Ezrin and DEGs in the subpathways shown in Figure 3 is interesting. Ezrin is novel interacting protein of CASP9 in lung cancer . PDGF is able to induce phosphorylation of Ezrin protein .
To find biologically significant putative regulatory patterns for the co-expressed DEGs from Ezrin knockdown, the promoters of upregulated and downregulated DEGs were analyzed using the POCO program to find distinguishing sequence patterns and potential transcription factors. The results show MNB1A, DOF2, PBF, YY1, PRRX2, UBX and ARNT-AHR are unique for the downregulated DEGs promoters, while GATA2, ZNF42, deltaEF1, MYCN and ARNT are unique for the upregulated DEG promoters. The transcription factor Yin Yang (YY) 1 is overexpressed in most cancers, regulating multiple cancer-related proteins and signaling pathways, and overexpression of YY1 both at mRNA and protein levels in ESCC tissue has been confirmed by Luo et al. [29-31]. For other transcription factors that might respond to the upregulated DEGs, overexpression of Snail occurs in ESCC and correlates positively with lymphovascular invasion and was associated with worse overall survival . GATA factors are kind of zinc finger DNA binding proteins, regulating the specification and differentiation of numerous tissues by activating or repressing transcription . GATA2 promotes breast cancer cell growth by directly repressing transcription of PTEN tumor suppressor gene . DeltaEF1, also called ZEB1, is focally expressed in invasive ESCC . These suggests these transcription factors are not only critical in the regulation of upregulated and downregulated DEGs by Ezrin knockdown, but might participate in the invasion or metastasis mediated by Ezrin.
In summary, a comprehensive understanding the molecular mechanisms underlying Ezrin in ESCC after its knockdown was obtained by bioinformatics analyses, especially by the means of subpathway and sequence patterns for co-expression, which provide more clues than the traditional methods. These analytical methods are suited for other kinds of high-throughput data that are applicable to search novel functional genes and pathways related to gene(s) of interest.
This work was supported by grants from the NSFC-GuangDong Joint Fund (U0932001), the National Basic Research Program (2012CB526608), the National High Technology Research and Development Program of China (2012AA02A503 and 2012AA02A209), the National Science Foundation of China (30900560), the Foundation for Distinguished Young Talents in Higher Education of Guangdong (LYM09081) and Shantou University Medical Research Fund. We thank Dr. Stanley Lin, Department of Pathophysiology, The Key Immunopathology Laboratory of Guangdong Province, Shantou University Medical College, for the assistance in revising the manuscript.