Conducting Genome-Wide Association Studies: Epistasis Scenarios
Philip Cooley*, Nathan Gaddis, Ralph Folsom and Diane Wagener
RTI International, 3040 Cornwallis Road, P.O. Box 12194, Research Triangle Park, NC 27709, USA
*Corresponding Author:
Philip Cooley
RTI International, 3040 Cornwallis Road
P.O. Box 12194, Research Triangle Park
NC 27709, USA
E-mail: [email protected]
Received Date: June 21, 2012; Accepted Date: September 10, 2012; Published Date: September 12, 2012
Citation: Cooley P, Gaddis N, Folsom R, Wagener D (2012) Conducting Genome-Wide Association Studies: Epistasis Scenarios. J Proteomics Bioinform 5: 245-251. doi: 10.4172/jpb.1000244
Copyright: © 2012 Cooley P, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
This paper investigates epistatic scenarios in a genome-wide association study (GWAS) context, using a qualitative association model to assess which statistical models reliably predict associations between a qualitative phenotype (i.e., a disease diagnosis) and a pair of interacting genes. We employed the concept of relative risk, which is the ratio of the probability of a positive diagnosis given a mutated (risk) genotype to the probability of a positive diagnosis given a non-risk genotype.
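The relative risk definition above can be written out as a short calculation. This is a minimal sketch, not the authors' code: the function name `relative_risk`, its arguments, and the counts in the usage line are hypothetical and chosen only for illustration.

```python
def relative_risk(cases_risk, n_risk, cases_ref, n_ref):
    """Ratio of P(diagnosis | risk genotype) to P(diagnosis | non-risk genotype).

    All four arguments are counts; the names are illustrative assumptions.
    """
    if n_risk <= 0 or n_ref <= 0:
        raise ValueError("group sizes must be positive")
    p_risk = cases_risk / n_risk   # probability of diagnosis in the risk group
    p_ref = cases_ref / n_ref      # probability of diagnosis in the non-risk group
    if p_ref == 0:
        raise ValueError("non-risk probability is zero; the ratio is undefined")
    return p_risk / p_ref


# Hypothetical counts: 30 of 200 risk-genotype carriers diagnosed,
# versus 10 of 200 individuals without the risk genotype.
rr = relative_risk(30, 200, 10, 200)
```

A value near 1 would indicate no elevated risk for the risk genotype; values above 1 indicate a stronger association.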
We used a Monte Carlo-based simulation approach to generate synthetic data corresponding to a variety of possible epistatic models (EMs). Our method took into account the strength of association, disease prevalence in non-risk populations, and, most importantly, the inheritance patterns of the epistatic genes. We analyzed the simulated gene data to assess how these individual factors influenced statistical power in the context of GWAS.
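A simulation of this general kind might be sketched as follows. The details here are assumptions made for illustration and are not taken from the paper: genotypes are drawn under Hardy-Weinberg proportions, the epistatic model is a simple "dominant at both loci" pattern, and the function names and parameter values are hypothetical.

```python
import random


def draw_genotype(rng, maf):
    """Number of risk alleles (0, 1, or 2), assuming Hardy-Weinberg proportions."""
    return int(rng.random() < maf) + int(rng.random() < maf)


def simulate_cohort(n, maf_a, maf_b, baseline, rr, seed=0):
    """Simulate n individuals as (genotype_a, genotype_b, affected) tuples.

    Assumed epistatic model: risk is elevated only when BOTH loci carry
    at least one risk allele. `baseline` is the disease prevalence in the
    non-risk group and `rr` is the relative risk for the risk group.
    """
    rng = random.Random(seed)  # seeded so the synthetic data are reproducible
    cohort = []
    for _ in range(n):
        g_a = draw_genotype(rng, maf_a)
        g_b = draw_genotype(rng, maf_b)
        at_risk = g_a > 0 and g_b > 0
        p_disease = min(1.0, baseline * rr) if at_risk else baseline
        affected = rng.random() < p_disease
        cohort.append((g_a, g_b, affected))
    return cohort


# Hypothetical parameter values, chosen only to exercise the function.
cohort = simulate_cohort(n=20000, maf_a=0.3, maf_b=0.3,
                         baseline=0.05, rr=3.0, seed=1)
```

Because the generating model is known, each simulated record carries its own ground truth, which is what allows different statistical methods to be compared on the same data.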
Using simulated data provides two distinct advantages. First, the association-affecting factors are isolated and can be linked to the affecting locus. Second, we can use any specific statistical method to perform the assessment. The simulated dataset provides a truth set for assessing the effect of statistical method choice on association sensitivity, and it highlights the role of errors in disease diagnosis and incorrect genotype assignments.
The results indicate that the most powerful statistical methods for predicting associations between phenotypes and genotypes in epistatic scenarios are those that simultaneously test for associations involving both interacting loci. This result is not surprising and has been reported by others: two-gene models produce better predictions of association than single-gene models. The significance of this study is twofold. First, it incorporates recently developed statistical methods in the comparison analysis. Second, it documents the extent to which single-gene models fail to predict associations involving interacting genes when the phenotypes are constructed to be associated with low risk.
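The contrast between single-gene and two-gene tests can be illustrated with a generic Pearson chi-square test on contingency tables. This sketch is an assumption for illustration only: it is not one of the statistical methods compared in the paper, the carrier/non-carrier coding is a simplification, and the counts are made up.

```python
def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts, columns = (affected, unaffected).
# Joint table rows: carrier status at (locus A, locus B) =
# (no, no), (no, yes), (yes, no), (yes, yes); only the last row has elevated risk.
joint = [[5, 95], [5, 95], [5, 95], [15, 85]]
# Single-locus table: the same individuals collapsed over locus B
# (rows: non-carrier at A, carrier at A).
single = [[10, 190], [20, 180]]

joint_stat, joint_dof = pearson_chi_square(joint)
single_stat, single_dof = pearson_chi_square(single)
```

For these made-up counts, the single-locus statistic (about 3.60 with 1 degree of freedom) falls below the conventional 0.05 critical value of 3.84, while the joint statistic (about 10.81 with 3 degrees of freedom) exceeds the corresponding critical value of 7.81. Collapsing over the second locus dilutes the signal, which is consistent with the pattern described above.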