Benchmarking missing-values approaches for predictive models on health databases.
Alexandre Pérez, Gael Varoquaux, Marine Le Morvan, Julie Josse, Jean-Baptiste Poline
Published in: GigaScience (2022)
Supervised machine learning models with native support for missing values predict better than state-of-the-art imputation, at much lower computational cost. When imputation is used, it is important to add indicator columns that mark which values have been imputed.
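
As an illustration of the indicator-column recommendation, here is a minimal sketch in plain Python. It is not the authors' benchmark code: the function name `impute_with_indicators`, the choice of mean imputation, and the toy matrix `X` are all assumptions made for this example. Each column is mean-imputed over its observed values, and one 0/1 column per feature is appended to record which entries were filled in.

```python
import math
from statistics import mean


def impute_with_indicators(rows):
    """Mean-impute NaNs per column and append one 0/1 indicator per column.

    Hypothetical helper for illustration; mean imputation is an assumed choice.
    """
    n_cols = len(rows[0])

    # Column means computed over observed (non-NaN) values only.
    col_means = []
    for j in range(n_cols):
        observed = [r[j] for r in rows if not math.isnan(r[j])]
        # Fall back to 0.0 for a column with no observed value at all.
        col_means.append(mean(observed) if observed else 0.0)

    augmented = []
    for r in rows:
        filled = [col_means[j] if math.isnan(v) else v for j, v in enumerate(r)]
        flags = [1.0 if math.isnan(v) else 0.0 for v in r]
        # Layout of each output row: imputed features, then indicators.
        augmented.append(filled + flags)
    return augmented


# Toy data (made up): two features, one missing entry in each.
X = [
    [1.0, float("nan")],
    [3.0, 4.0],
    [float("nan"), 8.0],
]
X_aug = impute_with_indicators(X)
for row in X_aug:
    print(row)
```

The indicator columns let a downstream model tell an imputed value apart from an observed one, so the fact that a value was missing remains available as a signal. In scikit-learn, `SimpleImputer(add_indicator=True)` provides the same kind of augmentation.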