Is there a role for expectation maximization imputation in addressing missing data in research using WOMAC questionnaire? Comparison to the standard mean approach and a tutorial.
Academic Article
Overview
abstract
BACKGROUND: Standard mean imputation for missing values in the Western Ontario and Mc Master (WOMAC) Osteoarthritis Index limits the use of collected data and may lead to bias. Probability model-based imputation methods overcome such limitations but were never before applied to the WOMAC. In this study, we compare imputation results for the Expectation Maximization method (EM) and the mean imputation method for WOMAC in a cohort of total hip replacement patients. METHODS: WOMAC data on a consecutive cohort of 2,062 patients scheduled for surgery were analyzed. Rates of missing values in each of the WOMAC items from this large cohort were used to create missing patterns in the subset of patients with complete data. EM and the WOMAC's method of imputation are then applied to fill the missing values. Summary score statistics for both methods are then described through box-plot and contrasted with the complete case (CC) analysis and the true score (TS). This process is repeated using a smaller sample size of 200 randomly drawn patients with higher missing rate (5 times the rates of missing values observed in the 2,062 patients capped at 45%). RESULTS: Rate of missing values per item ranged from 2.9% to 14.5% and 1,339 patients had complete data. Probability model-based EM imputed a score for all subjects while WOMAC's imputation method did not. Mean subscale scores were very similar for both imputation methods and were similar to the true score; however, the EM method results were more consistent with the TS after simulation. This difference became more pronounced as the number of items in a subscale increased and the sample size decreased. CONCLUSIONS: The EM method provides a better alternative to the WOMAC imputation method. The EM method is more accurate and imputes data to create a complete data set. These features are very valuable for patient-reported outcomes research in which resources are limited and the WOMAC score is used in a multivariate analysis.