Predictive models in emergency medicine and their missing data strategies: a systematic review.
Emilien Arnaud, Mahmoud Elbattah, Christine Ammirati, Gilles Dequen, Daniel Aiham Ghazali. Published in: NPJ digital medicine (2023)
In the field of emergency medicine (EM), the use of decision support tools based on artificial intelligence has increased markedly in recent years. In some cases, data are omitted deliberately and thus constitute "data not purposely collected" (DNPC). This accepted information bias can be managed in various ways: dropping patients with missing data, imputing with the mean, or using automatic techniques (e.g., machine learning) to handle or impute the data. Here, we systematically reviewed the methods used to handle missing data in EM research. A systematic review was performed after searching PubMed with the query "(emergency medicine OR emergency service) AND (artificial intelligence OR machine learning)". Seventy-two studies were included in the review. The trained models variously predicted diagnosis in 25 (35%) publications, mortality in 21 (29%) publications, and probability of admission in 21 (29%) publications. Eight publications (11%) predicted two outcomes. Only 15 (21%) publications described their missing data. DNPC constitute the "missing data" in EM machine learning studies. Although DNPC have been described more rigorously since 2020, the descriptions in the literature are not exhaustive, systematic or homogeneous. Imputation appears to be the best strategy but requires more time and computational resources. To increase the quality and the comparability of studies, we recommend including the TRIPOD checklist in each new publication, summarizing the machine learning process in an explicit methodological diagram, and always publishing the area under the receiver operating characteristic curve, even when it is not the primary outcome.
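To make the strategies named in the abstract concrete, the following is a minimal Python sketch of two of them (dropping patients with missing data and imputing with the mean), together with a simple computation of the area under the receiver operating characteristic curve. It is an illustration only and is not taken from the reviewed studies: the toy records, the variable names (heart rate, lactate), and the helper functions `drop_incomplete`, `mean_impute`, and `auroc` are all hypothetical.

```python
from statistics import mean

# Hypothetical toy records: (heart_rate, lactate); None marks a value not collected.
records = [
    (88.0, 1.2),
    (110.0, None),
    (95.0, 2.5),
    (None, 4.0),
    (120.0, 3.1),
]

def drop_incomplete(rows):
    """Complete-case analysis: keep only rows with no missing value."""
    return [r for r in rows if all(v is not None for v in r)]

def mean_impute(rows):
    """Replace each missing value with the mean of the observed values in its column."""
    columns = list(zip(*rows))
    col_means = [mean(v for v in col if v is not None) for col in columns]
    return [tuple(m if v is None else v for v, m in zip(r, col_means)) for r in rows]

def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

complete = drop_incomplete(records)  # discards the two patients with a missing value
imputed = mean_impute(records)       # keeps all five patients
```

In this toy case, dropping incomplete rows keeps three of five patients, whereas mean imputation keeps all five at the cost of replacing each unobserved value with a column average. A model-based imputer would take the place of `mean_impute` in the same position of the pipeline.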