Machine learning can identify newly diagnosed patients with CLL at high risk of infection.
Rudi Agius, Christian Brieghel, Michael Asger Andersen, Alexander T Pearson, Bruno Ledergerber, Alessandro Cozzi-Lepri, Yoram Louzoun, Christen L Andersen, Jacob Bergstedt, Jakob H von Stemann, Mette Jørgensen, Man-Hung Eric Tang, Magnus Fontes, Jasmin Bahlo, Carmen D Herling, Michael Hallek, Jens Dilling Lundgren, Cameron Ross MacPherson, Jan Larsen, Carsten Utoft Niemann
Published in: Nature Communications (2020)
Infections have become the major cause of morbidity and mortality among patients with chronic lymphocytic leukemia (CLL) due to immune dysfunction and cytotoxic CLL treatment. Yet, predictive models for infection are missing. In this work, we develop the CLL Treatment-Infection Model (CLL-TIM) that identifies patients at risk of infection or CLL treatment within 2 years of diagnosis as validated on both internal and external cohorts. CLL-TIM is an ensemble algorithm composed of 28 machine learning algorithms based on data from 4,149 patients with CLL. The model is capable of dealing with heterogeneous data, including the high rates of missing data to be expected in the real-world setting, with a precision of 72% and a recall of 75%. To address concerns regarding the use of complex machine learning algorithms in the clinic, for each patient with CLL, CLL-TIM provides explainable predictions through uncertainty estimates and personalized risk factors.
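To make the reported precision and recall concrete, the following is a minimal sketch of how the two metrics are defined from confusion-matrix counts. The counts used here (18 true positives, 7 false positives, 6 false negatives) are hypothetical, chosen only because they reproduce the stated 72% and 75%; they are not taken from the study. The F1 value is likewise derived here from the rounded figures for illustration and is not a result reported in the abstract.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of patients flagged as high risk who truly had the event."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of patients with the event who were flagged as high risk."""
    return tp / (tp + fn)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts (NOT from the study), assumed nonzero denominators.
tp, fp, fn = 18, 7, 6

p = precision(tp, fp)  # 18 / 25
r = recall(tp, fn)     # 18 / 24
print(f"precision={p:.2f} recall={r:.2f} F1={f1_score(p, r):.2f}")
```

Read this way, a precision of 72% means roughly 7 in 10 patients flagged by the model go on to have an infection or require CLL treatment within 2 years, while a recall of 75% means the model flags about 3 in 4 of the patients who do.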