Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies
2Division of Clinical and Translational Sciences, Department of Internal Medicine, University of Texas Medical School, Biostatistics/Epidemiology/Research Design (BERD) Core, Center for Clinical and Translational Sciences (CCTS), The University of Texas Health Science Center at Houston, Houston, TX
3Division of Epidemiology, Human Genetics and Environmental Sciences, University of Texas School of Public Health, Division of Clinical and Translational Sciences, Department of Internal Medicine, University of Texas Medical School at Houston, and Center for Clinical and Translational Sciences at The University of Texas Health Science Center at Houston, Houston, TX
Corresponding Author:
Mohammad H. Rahbar
Professor of Epidemiology, Biostatistics, and Clinical & Translational Sciences
Center for Clinical and Translational Sciences
The University of Texas Health Science Center at Houston
UT Professional Building, Houston, TX
E-mail: [email protected]
Received Date: February 15, 2016; Accepted Date: February 29, 2016; Published Date: March 07, 2016
Citation: Vatcheva KP, Lee M, McCormick JB, Rahbar MH (2016) Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies. Epidemiol 6:227. doi:10.4172/2161-1165.1000227
Copyright: © 2016 Vatcheva KP, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is well documented in the statistical literature. Failure to identify and report multicollinearity can result in misleading interpretations of the results. A review of the epidemiological literature in PubMed from January 2004 to December 2013 illustrated the need for greater attention to identifying and minimizing the effect of multicollinearity in the analysis of data from epidemiologic studies. We used simulated datasets and real-life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in regression analysis and to encourage researchers to include diagnostics for multicollinearity as one of the steps in regression analysis.
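To make the idea of a multicollinearity diagnostic concrete, the following is a minimal sketch of one widely used diagnostic, the variance inflation factor (VIF), applied to simulated data. It is not the simulation design or the code used in this article. The function name `variance_inflation_factors`, the variable names, the sample size, and the noise scale are all assumptions chosen for illustration. For predictor j, the VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return the VIF of each column of X (rows = observations).

    Hypothetical helper for illustration: VIF_j = 1 / (1 - R_j^2), where
    R_j^2 is obtained by regressing column j on all other columns.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Ordinary least squares of predictor j on the others, with intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r_squared = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r_squared)
    return vifs


# Simulated predictors (assumed setup): x2 is almost a copy of x1,
# while x3 is generated independently of both.
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)
x3 = rng.normal(size=n)

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
for name, vif in zip(["x1", "x2", "x3"], vifs):
    print(f"{name}: VIF = {vif:.2f}")
```

In this setup the two nearly collinear predictors should show very large VIFs, while the independent predictor should stay close to 1. A VIF above roughly 10 is a commonly cited rule of thumb for problematic collinearity, although that cutoff is a convention rather than a formal test.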