Validation of an Automated System for the Extraction of a Wide Dataset for Clinical Studies Aimed at Improving the Early Diagnosis of Candidemia.

Daniele Roberto Giacobbe Sara Mora Alessio SignoriChiara RussoGiorgia BrucciCristina CampiSabrina GuastavinoCristina Marelli Alessandro Limongelli Patricia Muñoz Malgorzata MikulskaAnna MarcheseAntonio Di Biagio Mauro GiacominiMatteo Bassetti

Published in: Diagnostics (Basel, Switzerland) (2023)

There is increasing interest in assessing whether machine learning (ML) techniques could further improve the early diagnosis of candidemia among patients with a consistent clinical picture. The objective of the present study is to validate the accuracy of a system for the automated extraction from a hospital laboratory software of a large number of features from candidemia and/or bacteremia episodes as the first phase of the AUTO-CAND project. The manual validation was performed on a representative and randomly extracted subset of episodes of candidemia and/or bacteremia. The manual validation of the random extraction of 381 episodes of candidemia and/or bacteremia, with automated organization in structured features of laboratory and microbiological data resulted in ≥99% correct extractions (with confidence interval < ±1%) for all variables. The final automatically extracted dataset consisted of 1338 episodes of candidemia (8%), 14,112 episodes of bacteremia (90%), and 302 episodes of mixed candidemia/bacteremia (2%). The final dataset will serve to assess the performance of different ML models for the early diagnosis of candidemia in the second phase of the AUTO-CAND project.

Keyphrases