Author(s): Ware J, Kosinski M, Keller SD
Regression methods were used to select and score 12 items from the Medical Outcomes Study 36-Item Short-Form Health Survey (SF-36) to reproduce the Physical Component Summary and Mental Component Summary scales in the general US population (n=2,333). The resulting 12-item short-form (SF-12) achieved multiple R squares of 0.911 and 0.918 in predictions of the SF-36 Physical Component Summary and SF-36 Mental Component Summary scores, respectively. Scoring algorithms from the general population used to score 12-item versions of the two components (Physical Component Summary and Mental Component Summary) achieved R squares of 0.905 with the SF-36 Physical Component Summary and 0.938 with the SF-36 Mental Component Summary when cross-validated in the Medical Outcomes Study. Test-retest (2-week) correlations of 0.89 and 0.76 were observed for the 12-item Physical Component Summary and the 12-item Mental Component Summary, respectively, in the general US population (n=232). Twenty cross-sectional and longitudinal tests of empirical validity previously published for the 36-item short-form scales and summary measures were replicated for the 12-item Physical Component Summary and the 12-item Mental Component Summary, including comparisons between patient groups known to differ or to change in terms of the presence and seriousness of physical and mental conditions, acute symptoms, age and aging, self-reported 1-year changes in health, and recovery from depression. In 14 validity tests involving physical criteria, relative validity estimates for the 12-item Physical Component Summary ranged from 0.43 to 0.93 (median=0.67) in comparison with the best 36-item short-form scale. Relative validity estimates for the 12-item Mental Component Summary in 6 tests involving mental criteria ranged from 0.60 to 1.07 (median=0.97) in relation to the best 36-item short-form scale.
Average scores for the 2 summary measures, and those for most scales in the 8-scale profile based on the 12-item short-form, closely mirrored those for the 36-item short-form, although standard errors were nearly always larger for the 12-item short-form.
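
The multiple R squares reported above measure how much of the variance in a 36-item summary score is reproduced by a regression on a 12-item subset. The following Python sketch illustrates that calculation only. Everything in it is an assumption made for illustration: the data are synthetic, the weights are random, and choosing the 12 items with the largest absolute weights is a stand-in rule, not the selection procedure or scoring algorithm used by the authors.

```python
import numpy as np

def r_squared(X, y):
    """Multiple R^2 from an ordinary least-squares fit of y on X (with intercept)."""
    A = np.column_stack([np.ones(len(X)), X])        # add intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)     # least-squares coefficients
    resid = y - A @ coef
    ss_res = float(resid @ resid)                    # residual sum of squares
    ss_tot = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    return 1.0 - ss_res / ss_tot

# Synthetic stand-ins (hypothetical, not SF-36 data or published weights).
rng = np.random.default_rng(0)
n_people, n_items = 500, 36
items = rng.normal(size=(n_people, n_items))         # fake item responses
weights = rng.normal(size=n_items)                   # fake scoring weights
full_score = items @ weights                         # stand-in 36-item summary score

# Illustrative subset rule: keep the 12 items with the largest absolute weights.
subset = np.argsort(-np.abs(weights))[:12]

r2_subset = r_squared(items[:, subset], full_score)  # 12-item prediction of the score
r2_full = r_squared(items, full_score)               # all 36 items, for comparison

print(f"R^2 using 12 items: {r2_subset:.3f}")
print(f"R^2 using 36 items: {r2_full:.3f}")
```

With real survey data, `items` would hold respondents' item scores and `full_score` the SF-36 summary score being reproduced. Cross-validation, as in the abstract, would apply coefficients fitted in one sample to a second, independent sample.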