On the Relation between Prediction and Imputation Accuracy under Missing Covariates.

Burim Ramosaj, Justus Tulowietzki, Markus Pauly
Published in: Entropy (Basel, Switzerland) (2022)
Missing covariates in regression or classification problems can prohibit the direct use of advanced tools for further analysis. Recent research shows an increasing trend towards the use of modern Machine-Learning algorithms for imputation. This stems from their favorable prediction accuracy in different learning problems. In this work, we analyze through simulation the interaction between imputation accuracy and prediction accuracy in regression learning problems with missing covariates when Machine-Learning-based methods are used for both imputation and prediction. We see that even a slight decrease in imputation accuracy can seriously affect the prediction accuracy. In addition, we explore imputation performance when using statistical inference procedures in prediction settings, such as the coverage rates of (valid) prediction intervals. Our analysis is based on empirical datasets provided by the UCI Machine Learning repository and an extensive simulation study.
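The interaction described in the abstract can be illustrated with a small toy simulation. The sketch below is not the authors' setup: it assumes two correlated Gaussian covariates, values missing completely at random in the second one, a fully observed test set, and simple linear stand-ins (mean imputation, regression imputation, ordinary least squares) rather than the Machine-Learning-based methods the abstract refers to. The function name `simulate` and all of its parameters are hypothetical.

```python
import numpy as np


def simulate(n_train=2000, n_test=2000, rho=0.8, miss_rate=0.4, seed=0):
    """Toy comparison of two imputation methods (all settings are assumptions)."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])

    def draw(n):
        # Two correlated covariates and a linear response with Gaussian noise.
        X = rng.multivariate_normal(np.zeros(2), cov, size=n)
        y = X[:, 0] + 2.0 * X[:, 1] + rng.normal(scale=0.5, size=n)
        return X, y

    X_train, y_train = draw(n_train)
    X_test, y_test = draw(n_test)

    # Missing completely at random in the second covariate of the training set.
    missing = rng.random(n_train) < miss_rate
    observed = ~missing

    # Imputation 1: replace missing values with the observed mean.
    X_mean = X_train.copy()
    X_mean[missing, 1] = X_train[observed, 1].mean()

    # Imputation 2: predict the missing covariate from the first covariate,
    # using a line fitted on the rows where both are observed.
    slope, intercept = np.polyfit(X_train[observed, 0], X_train[observed, 1], deg=1)
    X_reg = X_train.copy()
    X_reg[missing, 1] = intercept + slope * X_train[missing, 0]

    results = {}
    for name, X_imp in {"mean": X_mean, "regression": X_reg}.items():
        # Imputation accuracy: RMSE on the entries that were masked.
        imp_rmse = np.sqrt(np.mean((X_imp[missing, 1] - X_train[missing, 1]) ** 2))

        # Prediction accuracy: least squares fitted on the imputed training
        # data, evaluated on the fully observed test set.
        A_train = np.column_stack([np.ones(n_train), X_imp])
        beta, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
        A_test = np.column_stack([np.ones(n_test), X_test])
        pred_mse = np.mean((y_test - A_test @ beta) ** 2)

        results[name] = (float(imp_rmse), float(pred_mse))
    return results


if __name__ == "__main__":
    for method, (imp_rmse, pred_mse) in simulate().items():
        print(f"{method:>10}: imputation RMSE={imp_rmse:.3f}, test MSE={pred_mse:.3f}")
```

In this toy setting the less accurate imputation (mean imputation) also yields the larger test error, which mirrors the qualitative relation the abstract reports. The specific numbers depend on the assumed correlation, missingness rate, and noise level, and say nothing about the magnitudes found in the paper.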
Keyphrases
  • machine learning
  • mental health
  • deep learning
  • healthcare
  • single cell
  • rna seq
  • affordable care act