The Rockefeller University, USA
Dr. Yuval Itan holds a Ph.D. and an M.Res. in Modeling Biological Complexity from the CoMPLEX program at University College London, and a B.Sc. in Computational Biology from Bar Ilan University, Israel. Since 2010 he has been conducting in silico research to identify variants that confer susceptibility to infectious diseases in high-throughput genomic and proteomic data, including the development of novel methods for this purpose.
Infectious diseases have historically been the greatest killer of mankind, and they still account for about 25 percent of all human mortality worldwide. It has become increasingly clear that human genetic background is a key determinant of susceptibility to infectious diseases. High-throughput genomic methods are applied to determine the disease-causing allele(s), but they yield thousands of gene variants per patient. We recently developed a novel approach, the "human gene connectome" (HGC): a concept, method and database that describes the set of all in silico-predicted biologically plausible routes and distances between all pairs of human genes, available for non-commercial users at: http://lab.rockefeller.edu/casanova/HGC/. With the HGC, we generated a "gene-specific connectome" for each human gene, that is, the set of all human genes ranked by their predicted biological proximity to the core gene of interest. We demonstrated that the HGC is currently the most powerful approach for prioritizing high-throughput genetic variants in Mendelian disease studies, by effectively identifying novel herpes simplex encephalitis (HSE) morbid alleles in whole exome sequencing (WES) data from patients. However, there is no available method for automating the detection of candidate disease-causing alleles at the cohort level or for testing pathway-independent enrichment of gene sets, which poses a major bottleneck in the field of high-throughput clinical genomics. Following the hypothesis that, within a cohort of patients with the same Mendelian disease, the cluster that contains the true disease-causing gene for each patient is the HGC-predicted biologically smallest cluster, we developed and applied a Mendelian clustering algorithm, which estimates the biologically smallest HGC-predicted cluster that contains one allele per patient. With this approach we approximated a solution for an NP-complete algorithmic problem (i.e. one for which no efficient exact solution is known at large scale), thereby estimating and statistically validating the disease-causing alleles in a WES cohort of 86 HSE patients. We also developed a computer simulation, based on the HGC-predicted biological distance, that determines the statistical significance of any set of human genes being biologically clustered, meaning closely related to each other regardless of direct connectivity, and therefore overcame the main limitation of all currently available gene enrichment methods. The described approaches should facilitate large-scale automation of the detection of disease-causing alleles and high-throughput discovery of genotype-phenotype correlations.
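The two ideas described above (choosing one candidate gene per patient so that the chosen genes form the tightest possible cluster, and testing by simulation whether a gene set is more tightly clustered than chance) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the use of mean pairwise distance as the cluster size, the greedy local search, and the toy gene positions are all assumptions made for illustration, and no real HGC distances are used. A greedy search of this kind can stop at a local optimum, so it only approximates the smallest cluster.

```python
import itertools
import random


def cluster_size(genes, d):
    """Mean pairwise distance among the genes (assumed measure of cluster size)."""
    pairs = list(itertools.combinations(genes, 2))
    if not pairs:
        return 0.0
    return sum(d(a, b) for a, b in pairs) / len(pairs)


def smallest_cluster(candidates, d, max_rounds=100):
    """Greedy local search: pick one gene per patient to minimize cluster_size.

    candidates: one list of candidate genes per patient (hypothetical input).
    d: function returning the distance between two genes.
    """
    chosen = [genes[0] for genes in candidates]  # arbitrary starting point
    for _ in range(max_rounds):
        improved = False
        for i, genes in enumerate(candidates):
            # Score each candidate of patient i with the other choices fixed.
            def score(g):
                return cluster_size(chosen[:i] + [g] + chosen[i + 1:], d)

            best = min(genes, key=score)
            if score(best) < score(chosen[i]):
                chosen[i] = best
                improved = True
        if not improved:
            break  # no single swap helps any more (possibly a local optimum)
    return chosen


def clustering_p_value(gene_set, all_genes, d, n_sim=1000, seed=0):
    """Fraction of random same-size gene sets at least as tight as gene_set."""
    rng = random.Random(seed)
    observed = cluster_size(gene_set, d)
    hits = sum(
        cluster_size(rng.sample(all_genes, len(gene_set)), d) <= observed
        for _ in range(n_sim)
    )
    return (hits + 1) / (n_sim + 1)  # add-one correction avoids p = 0


# Toy data: genes placed on a line; distance is the gap between positions.
toy_pos = {"A": 0, "B": 1, "C": 2, "X": 10, "Y": 20, "Z": 30}


def toy_distance(a, b):
    return abs(toy_pos[a] - toy_pos[b])


toy_candidates = [["X", "A"], ["Y", "B"], ["C", "Z"]]  # three patients
toy_choice = smallest_cluster(toy_candidates, toy_distance)
toy_p = clustering_p_value(toy_choice, list(toy_pos), toy_distance)
print(toy_choice, toy_p)
```

In a real analysis the distance function would be replaced by a lookup of HGC-predicted distances, and the background gene list for the simulation would be the full set of human genes rather than six toy entries.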