Tutorial: Handling missing data in R with MICE
Multiple imputation (Rubin 1987, 1996) is the method of choice for complex incomplete data problems. Missing data that occur in more than one variable present a special challenge. Two general approaches for imputing multivariate data have emerged: joint
modeling (JM) and fully conditional specification (FCS) (van Buuren 2007). Multivariate Imputation by Chained Equations (MICE) is the name of software for imputing incomplete multivariate data by FCS.
In this tutorial we present the R package mice v2.1, which extends the functionality of
mice v1.0 in several ways (van Buuren and Groothuis-Oudshoorn 2009). The tutorial takes a hands-on, stepwise approach to using
mice v2.1 to solve incomplete data problems in real data. Its goal is to provide sound and practical imputation techniques for obtaining appropriate statistical inferences from incomplete data.
The tutorial focuses on the specification of the imputation model, the most challenging step in multiple imputation. There is no magical setting that produces appropriate imputations in every problem. The tutorial will teach you how to go beyond the default
settings. In addition, we outline practical tools and techniques for analyzing the imputed data.
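As a rough illustration of what "going beyond the default settings" can look like, the sketch below imputes the small nhanes example data set that ships with mice, first with the defaults and then with a modified imputation model, and finally pools an analysis across the imputed data sets. The specific choices (using "norm" for bmi, dropping hyp as a predictor of bmi, the seed value, and the regression of chl on age and bmi) are assumptions made only for illustration, not recommendations from the tutorial.

```r
# Minimal sketch; assumes the mice package is installed.
library(mice)

# nhanes: small example data set shipped with mice (age, bmi, hyp, chl)
data(nhanes)

# 1. Impute with the default settings (m = 5 imputed data sets)
imp_default <- mice(nhanes, m = 5, seed = 123, printFlag = FALSE)

# 2. Go beyond the defaults: start from the default specification and edit it
meth <- imp_default$method
meth["bmi"] <- "norm"        # illustrative: Bayesian linear regression for bmi

pred <- imp_default$predictorMatrix
pred["bmi", "hyp"] <- 0      # illustrative: do not use hyp when imputing bmi

imp_custom <- mice(nhanes, m = 5, method = meth,
                   predictorMatrix = pred, seed = 123, printFlag = FALSE)

# 3. Fit the analysis model on each completed data set, then pool the results
fit <- with(imp_custom, lm(chl ~ age + bmi))
pooled <- pool(fit)
summary(pooled)
```

Starting from the default method vector and predictor matrix, rather than building them from scratch, keeps the specification aligned with the columns of the data and makes each departure from the defaults explicit.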