Lessons Learned in Dealing with Missing Race Data: An Empirical
Investigation

Mulugeta Gebregziabher; Yumin Zhao; Neal Axon; Gregory E. Gilbert; Carrae Echols; Leonard E. Egede

doi:10.4172/2155-6180.1000138

Abstract

Background: Missing race data is a ubiquitous problem in studies that use large administrative datasets, such as those of the Veterans Health Administration and other sources. The most common approach to this problem has been Complete Case Analysis (CCA), which analyzes only those records with complete data. CCA requires the assumption that data are Missing Completely At Random (MCAR), and it can lead to biased estimates with inflated standard errors.

Objective: To examine the performance of a new imputation approach, Latent Class Multiple Imputation (LCMI), for imputing missing race data, and to compare it with CCA, Multiple Imputation (MI) and Log-Linear Multiple Imputation (LLMI).

Design/Participants: We empirically compared LCMI with CCA, MI and LLMI using simulated data and demonstrated their application using data from a sample of 13,705 veterans with type 2 diabetes, of whom 23% had unknown/missing race information.

Results: Our simulation study shows that under Missing At Random (MAR), LCMI leads to lower bias and lower standard error estimates than CCA, MI and LLMI. Similarly, in our data example, which does not conform to MCAR because subjects with missing race information had lower rates of medical comorbidities than those with race information, LCMI outperformed MI and LLMI, providing lower standard errors, especially when a relatively large number of latent classes is assumed for the latent class imputation model.

Conclusions: Our results show that LCMI is a valid statistical technique for imputing missing categorical covariate data, particularly missing race data, and that it offers advantages with respect to precision of estimates.
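The contrast between CCA and imputation when missingness depends on an observed covariate can be illustrated with a minimal simulation sketch. This is not the authors' LCMI or LLMI procedure: it uses a much simpler stand-in that imputes a binary race indicator by random draws within comorbidity strata, and it skips the parameter-draw step of proper multiple imputation. All probabilities, sample sizes, function names and the binary `minority` coding are hypothetical choices made for illustration only.

```python
import random


def simulate(n=20000, seed=0):
    """Return (comorbid, minority, race_missing) records using hypothetical rates."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        comorbid = rng.random() < 0.5
        # Race is associated with comorbidity (hypothetical probabilities).
        minority = rng.random() < (0.5 if comorbid else 0.2)
        # Missingness depends only on the always-observed comorbidity flag,
        # so race is MAR but not MCAR; race is missing more often among
        # subjects without comorbidity.
        race_missing = rng.random() < (0.1 if comorbid else 0.4)
        rows.append((comorbid, minority, race_missing))
    return rows


def full_data_proportion(rows):
    """Minority proportion using the true race values (unknowable in practice)."""
    return sum(minority for _, minority, _ in rows) / len(rows)


def complete_case_proportion(rows):
    """CCA: use only records whose race value is observed."""
    observed = [minority for _, minority, missing in rows if not missing]
    return sum(observed) / len(observed)


def stratified_mi_proportion(rows, n_imputations=20, seed=1):
    """Simplified imputation: draw missing race within comorbidity strata."""
    rng = random.Random(seed)
    # Observed minority rate within each comorbidity stratum.
    rate = {}
    for level in (False, True):
        obs = [m for c, m, miss in rows if c == level and not miss]
        rate[level] = sum(obs) / len(obs)
    estimates = []
    for _ in range(n_imputations):
        total = 0
        for comorbid, minority, missing in rows:
            if missing:
                # Impute by a Bernoulli draw at the stratum-specific rate.
                total += rng.random() < rate[comorbid]
            else:
                total += minority
        estimates.append(total / len(rows))
    # Pool the point estimates by averaging across imputed datasets.
    return sum(estimates) / len(estimates)


rows = simulate()
print("full data      :", round(full_data_proportion(rows), 3))
print("complete cases :", round(complete_case_proportion(rows), 3))
print("stratified MI  :", round(stratified_mi_proportion(rows), 3))
```

In this setup the complete-case estimate over-represents subjects with comorbidity, so it drifts away from the full-data proportion, while the stratified imputation stays close to it. The sketch only compares point estimates; it does not reproduce the standard error comparisons reported in the abstract.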


Journal of Biometrics & Biostatistics