Practical strategies for handling breakdown of multiple imputation procedures.

Cattram Duong Nguyen, John B Carlin, Katherine J Lee
Published in: Emerging Themes in Epidemiology (2021)
Multiple imputation is a recommended method for handling incomplete data problems. One of the barriers to its successful use is the breakdown of the multiple imputation procedure, often due to numerical problems with the algorithms used within the imputation process. These problems frequently occur when imputation models contain large numbers of variables, especially with the popular approach of multivariate imputation by chained equations. This paper describes common causes of failure of the imputation procedure including perfect prediction and collinearity, focusing on issues when using Stata software. We outline a number of strategies for addressing these issues, including imputation of composite variables instead of individual components, introducing prior information and changing the form of the imputation model. These strategies are illustrated using a case study based on data from the Longitudinal Study of Australian Children.
Keyphrases
  • mental health
  • minimally invasive
  • machine learning
  • healthcare
  • electronic health record
  • big data
  • young adults
  • data analysis
  • deep learning
  • artificial intelligence