Coregulation mapping based on individual phenotypic variation in response to virus infection
Background: Gene coregulation across a population is an important aspect of the considerable variability in the human immune response to virus infection. Investigating it requires combining several techniques, ranging from gene clustering to transcription factor enrichment analysis.
Results: We developed a methodology to investigate gene-to-gene correlations in the expression of 34 genes linked to the immune response of conventional dendritic cells (DCs) infected with Newcastle disease virus (NDV), using cells from 145 human donors. Gene expression levels varied widely across individuals. We generated a map of gene co-expression using pairwise correlation and multidimensional scaling (MDS). The analysis showed that the 13 genes remaining after filtering for statistically significant variation form two clusters. We investigated to what extent the observed correlation patterns can be explained by the sharing of transcription factors (TFs) that control these genes. Our analysis showed a significant positive correlation between MDS distances and TF sharing across all pairs of genes. We applied enrichment analysis to the TFs with binding sites in the promoter regions of those genes. After Gene Ontology filtering, this analysis indicated the existence of two clusters of transcriptionally coregulated genes: (CCL5, IFNA1, IFNA2, IFNB1) and (IKBKE, IL6, IRF7, MX1). To make the methodology easier for other researchers to use, we have also developed CorEx, an interactive web-based coregulation explorer. It supports MDS and hierarchical clustering of data combined with TF enrichment analysis. We also offer web services that provide programmatic access to MDS, hierarchical clustering and TF enrichment analysis.
Conclusions: Correlation-based MDS mapping, used in conjunction with TF enrichment analysis, is a useful computational method for generating predictions about gene coregulation across a population.
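
The correlation-based MDS mapping described above can be illustrated with a minimal sketch. It is not the authors' implementation or the CorEx code. It assumes Pearson correlation, a dissimilarity of 1 - r, and classical (Torgerson) MDS, none of which are specified in the abstract. The expression matrix is synthetic and the gene labels are hypothetical placeholders.

```python
import numpy as np


def correlation_distance(expr):
    """Return a genes x genes dissimilarity matrix (1 - Pearson r).

    expr is a donors x genes array. Using 1 - r as the dissimilarity
    is an assumption made for this sketch.
    """
    r = np.corrcoef(expr, rowvar=False)
    dist = 1.0 - r
    np.fill_diagonal(dist, 0.0)  # remove floating-point residue on the diagonal
    return dist


def classical_mds(dist, n_components=2):
    """Embed points with classical (Torgerson) MDS."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering  # double centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]  # largest eigenvalues first
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))
    return eigvecs[:, order] * scale


def pairwise_euclidean(coords):
    """Return Euclidean distances between all pairs of embedded points."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


# Synthetic data: 145 donors, 6 hypothetical genes driven by two
# independent latent factors (three genes per factor).
rng = np.random.default_rng(0)
n_donors = 145
latent = rng.normal(size=(n_donors, 2))
genes = ["gene_a1", "gene_a2", "gene_a3", "gene_b1", "gene_b2", "gene_b3"]
module = np.array([0, 0, 0, 1, 1, 1])
expr = latent[:, module] + 0.3 * rng.normal(size=(n_donors, len(genes)))

dist = correlation_distance(expr)
coords = classical_mds(dist, n_components=2)
map_dist = pairwise_euclidean(coords)

for name, (x, y) in zip(genes, coords):
    print(f"{name}: ({x:+.3f}, {y:+.3f})")
```

In this toy setting, genes driven by the same latent factor land close together on the map, which is the kind of grouping that a TF enrichment analysis could then try to explain.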