A simple imputation algorithm reduced missing data in SF-12 health surveys
Abstract
Objective
The SF-12 Health Survey is a 12-item questionnaire that yields two summary scores (physical and mental health). Neither score can be computed when an item is missing. We explored imputation methods for missing scores for this instrument.
Study design and setting
Using data from a population-based survey, we tested several ways of imputing simulated missing data.
Results
Among 1250 participants, 118 (9.6%) had at least one missing SF-12 item. Missing data were more common among women, older respondents, non-Swiss nationals, and health service users. Among the 1132 respondents with complete data, replacement of any item with the mean population item weight yielded good results: the mean correlation between imputed and true score was 0.979 for both the physical and mental score. Results remained satisfactory when up to three of the six key items for each score (items that contribute predominantly to a given score), and any number of non-key items, were replaced by the mean. Application of this imputation algorithm to the original survey reduced the proportion of missing scores to <1%. Respondents with incomplete surveys, hence imputed scores, had lower scores than respondents with complete data (physical score: 44.9 vs. 49.8, p < 0.001; mental score: 44.4 vs. 46.3, p = 0.064).
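The imputation rule described above can be illustrated with a minimal sketch. It assumes that each response has already been converted to its scoring weight for the summary score in question, and that the score is a constant plus the sum of the item weights. The function names, item labels, constant, and weights below are hypothetical placeholders, not the published SF-12 scoring coefficients or the authors' own code.

```python
from statistics import mean


def mean_item_weights(complete_rows):
    """Mean weight of each item among respondents with complete data."""
    items = complete_rows[0].keys()
    return {item: mean(row[item] for row in complete_rows) for item in items}


def impute_score(row, item_means, key_items, constant, max_missing_key=3):
    """Summary score with missing item weights replaced by the item mean.

    Returns None when more than `max_missing_key` key items are missing,
    in which case the score is left missing. Non-key items may be
    missing in any number.
    """
    missing_key = sum(1 for item in key_items if row.get(item) is None)
    if missing_key > max_missing_key:
        return None
    total = constant
    for item, fallback in item_means.items():
        value = row.get(item)
        total += fallback if value is None else value
    return total


# Toy data with two items and made-up weights (illustration only).
complete = [{"q1": 1.0, "q2": 2.0}, {"q1": 3.0, "q2": 4.0}]
means = mean_item_weights(complete)

respondent = {"q1": None, "q2": 4.0}  # q1 is missing
score = impute_score(respondent, means, key_items=["q1"],
                     constant=10.0, max_missing_key=1)
```

In a real analysis the same function would be called once for the physical score and once for the mental score, each with its own weights, constant, and list of six key items.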
Conclusions
A simple imputation algorithm can substantially reduce the proportion of missing scores for the SF-12 health survey, and consequently reduce non-response bias.
Key words: Imputation methods, Non-response bias, Population surveys, Health status
PII: S0895-4356(04)00194-5
doi:10.1016/j.jclinepi.2004.06.005
© 2005 Elsevier Inc. All rights reserved.
