Distributional Characteristics of Selected Chemical and Environmental Variables: Data from NHANES 2003-2004
Ram B Jain*
Private Consultant, Dacula, USA
*Corresponding Author: Ram B Jain
E-mail: [email protected]
Received date: January 25, 2017; Accepted date: February 18, 2017; Published date: February 25, 2017
Citation: Jain RB (2017) Distributional Characteristics of Selected Chemical and Environmental Variables: Data from NHANES 2003-2004. Epidemiology (Sunnyvale) 7:297. doi: 10.4172/2161-1165.1000297
Copyright: © 2017 Jain RB. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Objective: Log-transformations are commonly used to normalize chemical data. However, log-transformations do not always normalize the data. Thus, the objective of this study was to recursively use Tukey’s exploratory techniques to erect fences towards the data extremes until normality or near normality was achieved for the data lying within these fences.
Design: Data from the National Health and Nutrition Examination Survey (NHANES) for the period 2003–2004 for 27 variables were used to conduct this study. The 27 variables included serum folate, serum transferrin receptor, urinary perchlorate, and serum polychlorinated biphenyl (PCB) 44, PCB-28, PCB-87, and PCB-52. Tukey’s exploratory techniques were applied recursively to erect fences towards the data extremes until the data lying within the fences were normal or nearly normal. Robust techniques were then used to estimate statistical parameters for the reduced data lying within these fences. The statistical properties of the reduced data were evaluated and compared with those of the original log-transformed data.
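The recursive fencing described in the Design can be sketched as follows. This is a minimal illustration, not the author's implementation: the fence multiplier `k = 1.5`, the skewness and excess-kurtosis tolerance `tol` used as a stand-in for "near normality", the round limit, and all function names are assumptions, since the abstract does not state which fence rule or normality criterion was used. NHANES sampling weights are ignored.

```python
import statistics


def tukey_fences(values, k=1.5):
    """Return (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR.

    k = 1.5 is the conventional multiplier; it is an assumption here.
    """
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (both 0 for a normal law)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        return 0.0, 0.0
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0


def recursive_trim(log_values, k=1.5, tol=0.5, max_rounds=20):
    """Re-erect fences on the retained data until it looks near normal.

    Stops when |skewness| and |excess kurtosis| are both within tol
    (an assumed criterion), when a round trims nothing, or after max_rounds.
    """
    kept = sorted(log_values)
    for _ in range(max_rounds):
        skew, kurt = skew_and_excess_kurtosis(kept)
        if abs(skew) <= tol and abs(kurt) <= tol:
            break
        lower, upper = tukey_fences(kept, k)
        inside = [v for v in kept if lower <= v <= upper]
        if len(inside) == len(kept):
            break  # fences no longer remove anything
        kept = inside
    return kept
```

The function expects values that have already been log-transformed; the observations it drops are the ones that would be set aside for separate examination.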
Setting: Cross-sectional data from the National Health and Nutrition Examination Survey (NHANES) for the period 2003–2004 for 27 variables.
Subjects: Between 1790 and 8363 NHANES 2003-2004 participants, depending upon the variable of interest.
Results: The use of non-normal data for statistical analysis can lead to under- or over-estimation of the measures of central tendency (means and geometric means), depending upon the relative number and magnitude of the observations that are identified as potential outliers and trimmed from the lower and upper tails of the original distributions to achieve normality. The standard deviations are always over-estimated, as are the widths of the confidence intervals around the means. Additional insight into the demographic characteristics of the participants whose values were trimmed from the extreme tails can be very valuable.
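The comparison reported in the Results can be illustrated by summarizing the same variable before and after trimming. The sketch below is hypothetical: the function names, the normal-approximation multiplier `z = 1.96`, and the unweighted formulas are assumptions and do not reproduce the robust estimators or the survey-weighted analysis used in the study.

```python
import math
import statistics


def summarize(log_values, z=1.96):
    """Unweighted summary of log-transformed data (assumed formulas)."""
    n = len(log_values)
    mean = statistics.fmean(log_values)
    sd = statistics.stdev(log_values)
    half_width = z * sd / math.sqrt(n)  # normal-approximation CI half width
    return {
        "n": n,
        "mean_log": mean,
        "geometric_mean": math.exp(mean),
        "sd_log": sd,
        "ci_width_log": 2 * half_width,
    }


def compare(original, trimmed):
    """Original-minus-trimmed difference for each summary statistic.

    A positive value means the untrimmed data gave the larger estimate.
    """
    full, reduced = summarize(original), summarize(trimmed)
    return {key: full[key] - reduced[key] for key in full if key != "n"}
```

A positive `sd_log` or `ci_width_log` difference corresponds to the over-estimation described above, while the sign of the `geometric_mean` difference depends on which tail contributed more of the trimmed observations.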
Conclusion: To obtain correct descriptive estimates, it is worthwhile to temporarily trim a small percentage of the data (probably < 5%) to achieve normality or near normality. An evaluation of the trimmed data can provide insight into the characteristics of the persons who have too low or too high concentrations of a given chemical of interest.