Predicting Type 1 Diabetes Candidate Genes using Human Protein-Protein Interaction Networks
Shouguo Gao, Xujing Wang*
Department of Physics & the Comprehensive Diabetes Center, University of Alabama at Birmingham, 1300 University Blvd, Birmingham, AL 35294, USA
*Corresponding Author:
Dr. Xujing Wang
Department of Physics & the Comprehensive Diabetes Center
University of Alabama at Birmingham
1300 University Blvd, Birmingham
AL 35294, USA
E-mail: [email protected]
Received date: February 27, 2009; Accepted date: March 30, 2009; Published date: April 01, 2009
Citation: Gao S, Wang X (2009) Predicting Type 1 Diabetes Candidate Genes using Human Protein-Protein Interaction Networks. J Comput Sci Syst Biol 2:133-146. doi:10.4172/jcsb.1000025
Copyright: © 2009 Gao S, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Background: Proteins that interact directly with each other tend to have similar functions and to be involved in the same cellular processes, and mutations in the genes that encode them often lead to the same family of disease phenotypes. Efforts have been made to prioritize positional candidate genes for complex diseases using protein-protein interaction (PPI) information, but such an approach is often considered too general to be practically useful for specific diseases.

Results: In this study we investigate the efficacy of this approach in type 1 diabetes (T1D). We compiled 266 known disease genes, and 983 positional candidate genes from the 18 established linkage loci of T1D, from T1Dbase (https://t1dbase.org). We found that the PPI network of known T1D genes has topological features that distinguish it from others, with a significantly higher number of interactions among the known genes themselves, even after adjusting for their high network degrees (p<1e-5). We then defined the positional candidates that are first-degree PPI neighbours of the 266 known disease genes to be new candidate disease genes. This leads to a list of 68 genes for further study. Cross-validation using the known disease genes as a benchmark shows that the enrichment is ~17.1-fold over random selection, and ~4-fold better than using the linkage information alone. The new candidates are cited in T1D-related publications significantly more often than expected at random (p<1e-7), even after excluding co-citation with the known disease genes, and they are significantly over-represented (p<1e-10) in the top 30 GO terms shared by known disease genes. Furthermore, sequence analysis reveals that they contain significantly (p<0.0004) more protein domains that are known to be relevant to T1D. These findings provide indirect validation of the newly predicted candidates.

Conclusion: Our study demonstrates the potential of PPI information in prioritizing positional candidate genes for T1D.
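The first-degree-neighbour selection described in the Results can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy gene identifiers, and the edge list are hypothetical, and the `fold_enrichment` helper is only the generic ratio of two hit rates, since the abstract does not specify the cross-validation procedure.

```python
from typing import Dict, Iterable, Set, Tuple


def build_adjacency(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from PPI pairs, ignoring self-loops."""
    adj: Dict[str, Set[str]] = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def first_neighbour_candidates(
    adj: Dict[str, Set[str]], known: Set[str], positional: Set[str]
) -> Set[str]:
    """Positional candidates that directly interact with at least one known gene."""
    neighbours: Set[str] = set()
    for gene in known:
        neighbours |= adj.get(gene, set())
    # Keep only positional candidates; genes already known are not "new".
    return (neighbours & positional) - known


def fold_enrichment(
    hits_selected: int, n_selected: int, hits_background: int, n_background: int
) -> float:
    """Ratio of the hit rate in a selected set to the hit rate in the background."""
    return (hits_selected / n_selected) / (hits_background / n_background)


# Toy data (hypothetical identifiers, not real genes or interactions).
toy_edges = [("K1", "P1"), ("K1", "K2"), ("K2", "P2"), ("P3", "X1")]
toy_known = {"K1", "K2"}
toy_positional = {"P1", "P2", "P3"}

toy_adj = build_adjacency(toy_edges)
toy_candidates = first_neighbour_candidates(toy_adj, toy_known, toy_positional)
print(sorted(toy_candidates))
```

In this toy network, P3 lies in a linkage region but has no direct interaction with a known gene, so it is filtered out. In the study itself, the same kind of filter reduces 983 positional candidates to 68.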