Potential and limitations of machine meta-learning (ensemble) methods for predicting COVID-19 mortality in a large inhospital Brazilian dataset.

Bruno Barbosa Miranda de Paiva Polianna Delfino Pereira Claudio Moisés Valiense de Andrade Virginia Mara Reis Gomes Maíra Viana Rego Souza E Silva Karina Paula Medeiros Prado Martins Thaís Lorenna Souza Sales Rafael Lima Rodrigues de Carvalho Magda Carvalho Pires Lucas Emanuel Ferreira Ramos Rafael Tavares Silva Alessandra de Freitas Martins Vieira Aline Gabrielle Sousa Nunes Alzira de Oliveira Jorge Amanda de Oliveira Maurílio Ana Luiza Bahia Alves Scotton Carla Thais Cândida Alves da Silva Christiane Correa Rodrigues Cimini Daniela Ponce Elayne Crestani Pereira Euler Roberto Fernandes Manenti Fernanda D Athayde Rodrigues Fernando Anschau Fernando Antonio Botoni Frederico Bartolazzi Genna Maira Santos Grizende Helena Carolina Noal Helena Duani Isabela Moraes Gomes Jamille Hemétrio Salles Martins Costa Júlia di Sabatino Santos Guimarães Julia Teixeira Tupinambás Juliana Rodrigues Machado Rúgolo Joanna d'Arc Lyra Batista Joice Coutinho de Alvarenga José Miguel Chatkin Karen Brasil Ruschel Liege Barella Zandoná Lílian Santos Pinheiro Luanna da Silva Monteiro Menezes Lucas Moyses Carvalho de Oliveira Luciane Kopittke Luisa Argolo Assis Luiza Margoto Marques Magda César Raposo Maiara Anschau Floriani Maria Aparecida Camargos Bicalho Matheus Carvalho Alves Nogueira Neimy Ramos de Oliveira Patricia Klarmann Ziegelmann Pedro Gibson Paraiso Petrônio José de Lima Martelli Roberta Senger Rochele Mosmann Menezes Saionara Cristina Francisco Silvia Ferreira Araújo Tatiana Kurtz Tatiani Oliveira Fereguetti Thainara Conceição de Oliveira Yara Cristina Neves Marques Barbosa Ribeiro Yuri Carlotto Ramires Maria Clara Pontello Barbosa Lima Marcelo Carneiro Adriana Falangola Benjamin Bezerra Alexandre Vargas Schwarzbold André Soares de Moura Costa Bárbara Lopes Farace Daniel Vitório Silveira Evelin Paola de Almeida Cenci Fernanda Barbosa Lucas Fernando Graça Aranha Gisele Alsina Nader Bastos Giovanna Grunewald Vietta Guilherme Fagundes Nascimento Heloisa Reniers Vianna Henrique Cerqueira Guimarães Júlia Drumond Parreiras de Morais Leila Beltrami Moreira Leonardo Seixas de Oliveira Lucas de Deus Sousa Luciano de Souza Viana Máderson Alvares de Souza Cabral Maria Angélica Pires Ferreira Mariana Frizzo de Godoy Meire Pereira de Figueiredo Milton Henriques Guimarães-Júnior Mônica Aparecida de Paula de Sordi Natália da Cunha Severino Sampaio Pedro Ledic Assaf Raquel Lutkmeier Reginaldo Aparecido Valacio Renan Goulart Finger Rufino de Freitas Silva Silvana Mangeon Mereilles Guimarães Talita Fischer Oliveira Thulio Henrique Oliveira Diniz Marcos André Gonçalves Milena Soriano Marcolino

Published in: Scientific reports (2023)

Most early scores and methods for predicting COVID-19 mortality are limited by methodological flaws and technological constraints (e.g., reliance on a single prediction model). Our aim is to provide a thorough comparative study that addresses those methodological issues by considering multiple techniques for building mortality prediction models, including modern machine learning (neural) algorithms, traditional statistical techniques, and meta-learning (ensemble) approaches. The study used a dataset from a multicenter cohort of 10,897 adult Brazilian COVID-19 patients admitted from March 2020 to November 2021 [median age 60 (interquartile range 48-71), 46% women]. We also proposed original population-based meta-features that had not previously been described in the literature. Stacking achieved the best results reported in the literature for the death prediction task, improving over the previous state of the art by more than 46% in Recall for predicting death, with an AUROC of 0.826 and a MacroF1 of 65.4%. The newly proposed meta-features were highly discriminative of death but did not produce large improvements in final prediction performance, suggesting that we may be at the limit of the prediction capability achievable with the current set of ML techniques and (meta-)features. Finally, we investigated how the trained models perform across hospitals and found large differences in classifier performance between them, which further supports the view that errors are produced by factors that cannot be modeled with the current predictors.
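For readers less familiar with the metrics quoted above, the following is a minimal sketch of how Recall for the death class and MacroF1 are computed for a binary outcome. It is an illustration only, not the study's evaluation code: the function names `f1_for_class` and `macro_f1` and the ten toy labels are assumptions made up for this example, and no patient data are involved.

```python
# Illustrative only: toy labels, not data from the study.
# Label convention assumed here: 1 = death, 0 = survived.

def f1_for_class(y_true, y_pred, positive):
    """Return (precision, recall, F1), treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1, so the minority class counts equally."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c)[2] for c in classes) / len(classes)

# Hypothetical imbalanced example: 8 survivors, 2 deaths.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
death_precision, death_recall, death_f1 = f1_for_class(y_true, y_pred, positive=1)
macro = macro_f1(y_true, y_pred)

print(f"accuracy={accuracy:.3f} death_recall={death_recall:.3f} macro_f1={macro:.4f}")
```

In this toy case accuracy is 0.8 while Recall for death is only 0.5, which illustrates why class-sensitive metrics such as death-class Recall and MacroF1 are informative when deaths are the minority outcome.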