Using Information Content for Expanding Human Protein Coding Gene Interaction Networks
R Fechete1, A Heinzel1, J Söllner1, P Perco1, A Lukas1 and B Mayer1,2*
*Corresponding Author:
Bernd Mayer
Emergentec Biodevelopment GmbH
Gersthofer Strasse 29-31
1180 Vienna, Austria
Tel: +43 1 4034966
Fax: +43 1 4034966-19
E-mail: [email protected]
Received date: March 04, 2013; Accepted date: April 04, 2013; Published date: April 08, 2013
Citation: Fechete R, Heinzel A, Söllner J, Perco P, Lukas A, et al. (2013) Using Information Content for Expanding Human Protein Coding Gene Interaction Networks. J Comput Sci Syst Biol 6:073-082. doi:10.4172/jcsb.1000102
Copyright: © 2013 Fechete R, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Molecular interaction networks have emerged as a central analysis concept for Omics profile interpretation. This development is driven by the need to improve hypothesis generation beyond the mere interpretation of molecular feature lists derived from statistical analysis of high-throughput experiments. A number of human gene and protein interaction networks are available for this task, but they differ in the biological nature of the interactions represented and in their coverage of the molecular feature space at the gene, transcript, protein and metabolite level. Both aspects strongly affect hypothesis generation. We present a methodology for deriving expanded interaction networks by consolidating available interaction information and adding computationally inferred interactions.
Integrating interaction data from the public domain repositories IntAct, BioGrid and Reactome resulted in a core interaction network representing 11,162 human protein coding genes (out of a total of 19,980 protein coding genes) and 145,391 interactions. Using ontology annotation on involvement in specific molecular pathways and functions, combined with structural (domain) information, as gene/protein node parameterization allowed computation of probabilities for additional interactions based on the information content of the individual sources. Topological measures such as degree centrality, global clustering coefficient and characteristic path length were used to define a cutoff for interaction probabilities, resulting in an expanded interaction network holding 13,730 protein coding genes and 830,470 interactions. Evaluating this hybrid network against established interaction networks such as KEGG showed significant recovery of documented interactions, indicating the validity of the expansion methodology.
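The idea of scoring candidate interactions by the information content of shared annotation can be illustrated with a minimal sketch. It is not the authors' probability model: it assumes a simplified score (the summed information content of annotation terms two genes share, with information content taken as the negative log2 of a term's annotation frequency), a fixed hypothetical cutoff rather than one derived from network topology, and invented gene and term names.

```python
import math
from collections import defaultdict
from itertools import combinations


def information_content(annotations):
    """IC(term) = -log2(fraction of genes annotated with the term)."""
    n_genes = len(annotations)
    counts = defaultdict(int)
    for terms in annotations.values():
        for term in set(terms):
            counts[term] += 1
    return {term: -math.log2(c / n_genes) for term, c in counts.items()}


def pair_scores(annotations, ic):
    """Score each gene pair by the summed IC of the terms both genes share."""
    scores = {}
    for a, b in combinations(sorted(annotations), 2):
        shared = set(annotations[a]) & set(annotations[b])
        if shared:
            scores[(a, b)] = sum(ic[t] for t in shared)
    return scores


def expand_network(core_edges, scores, cutoff):
    """Add inferred edges scoring at or above the cutoff to the core edges."""
    edges = {tuple(sorted(e)) for e in core_edges}
    edges |= {pair for pair, s in scores.items() if s >= cutoff}
    return edges


# Hypothetical toy annotation (pathway and domain terms per gene).
annotations = {
    "GENE_A": {"pathway_1", "domain_x"},
    "GENE_B": {"pathway_1", "domain_x"},
    "GENE_C": {"pathway_1"},
    "GENE_D": {"domain_y"},
}
core_edges = [("GENE_C", "GENE_D")]  # stands in for the consolidated core network

ic = information_content(annotations)
scores = pair_scores(annotations, ic)
expanded = expand_network(core_edges, scores, cutoff=1.0)  # cutoff is an assumption
print(sorted(expanded))
```

In this toy example, rare terms contribute more to a pair's score than common ones, so GENE_A and GENE_B (sharing a frequent pathway term and a rarer domain term) pass the cutoff, while pairs sharing only the frequent pathway term do not.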
Integrating available interaction data and enlarging it with inferred interactions provided an expanded human interactome with respect to both the number of represented molecular features and the number of interactions, thereby promising improved Omics profile interpretation.