Explainable AI identifies diagnostic cells of genetic AML subtypes.
Matthias Hehr, Ario Sadafi, Christian Matek, Peter Lienemann, Christian Pohlkamp, Torsten Haferlach, Karsten Spiekermann, Carsten Marr
Published in: PLOS Digital Health (2023)
Explainable AI is deemed essential for clinical applications because it allows model predictions to be rationalized, helping to build trust between clinicians and automated decision support tools. We developed an inherently explainable AI model for the classification of acute myeloid leukemia (AML) subtypes from blood smears and found that high-attention cells identified by the model coincide with those labeled as diagnostically relevant by human experts. Based on over 80,000 single white blood cell images from digitized blood smears of 129 patients diagnosed with one of four WHO-defined genetic AML subtypes and 60 healthy controls, we trained SCEMILA, a single-cell-based explainable multiple instance learning algorithm. SCEMILA could perfectly discriminate between AML patients and healthy controls and detected the APL subtype with an F1 score of 0.86±0.05 (mean±s.d., 5-fold cross-validation). Analyzing a novel multi-attention module, we confirmed that our algorithm focused with high concordance on the same AML-specific cells as human experts do. When applied to classify single cells, it can highlight subtype-specific cells and deconvolve the composition of a patient's blood smear without the need for single-cell annotation of the training data. Our large AML genetic subtype dataset is publicly available, and an interactive online tool facilitates the exploration of data and predictions. SCEMILA enables a comparison of algorithmic and expert decision criteria and can present a detailed analysis of individual patient data, paving the way to deploy AI in routine diagnostics for identifying hematopoietic neoplasms.
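The abstract does not spell out how attention over cells is computed. As a rough illustration only, and not the authors' SCEMILA implementation, the sketch below shows a generic single-head attention pooling step of the kind commonly used in attention-based multiple instance learning: each cell's feature vector receives a score, a softmax turns the scores into weights that sum to one, and the patient-level representation is the weighted sum of the cell vectors. The function name `attention_pool`, the parameters `V` and `w`, the dimensions, and the random inputs are all assumptions made for this example.

```python
import math
import random


def attention_pool(features, V, w):
    """Pool per-cell feature vectors into one patient-level vector.

    features: list of n_cells vectors, each of length d
    V: d x h projection matrix (hypothetical learned parameter)
    w: length-h scoring vector (hypothetical learned parameter)
    """
    scores = []
    for x in features:
        # Project the cell vector and squash it with tanh.
        hidden = [
            math.tanh(sum(x[i] * V[i][j] for i in range(len(x))))
            for j in range(len(w))
        ]
        # One scalar attention score per cell.
        scores.append(sum(hj * wj for hj, wj in zip(hidden, w)))

    # Softmax over cells, shifted by the max for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Attention-weighted sum of the cell vectors.
    d = len(features[0])
    bag = [sum(a * x[i] for a, x in zip(weights, features)) for i in range(d)]
    return bag, weights


# Illustrative random inputs; real inputs would be learned image features.
random.seed(0)
n_cells, d, h = 6, 4, 3
features = [[random.gauss(0, 1) for _ in range(d)] for _ in range(n_cells)]
V = [[random.gauss(0, 1) for _ in range(h)] for _ in range(d)]
w = [random.gauss(0, 1) for _ in range(h)]

bag, weights = attention_pool(features, V, w)

# Cells ranked from highest to lowest attention weight.
ranked_cells = sorted(range(n_cells), key=lambda i: weights[i], reverse=True)
```

In a setup like this, the per-cell weights are what make the model inspectable: the top entries of `ranked_cells` are the cells that contributed most to the patient-level vector, which is the kind of quantity that can be compared against expert annotations. The multi-attention module mentioned in the abstract presumably goes beyond this single-head form; its details are not given here.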
Keyphrases
- acute myeloid leukemia
- induced apoptosis
- cell cycle arrest
- single cell
- deep learning
- end stage renal disease
- machine learning
- endothelial cells
- newly diagnosed
- endoplasmic reticulum stress
- chronic kidney disease
- genome wide
- ejection fraction
- cell death
- big data
- bone marrow
- oxidative stress
- gene expression
- high throughput
- computed tomography
- healthcare
- prognostic factors
- allogeneic hematopoietic stem cell transplantation
- signaling pathway
- copy number
- clinical practice
- electronic health record
- PI3K/Akt
- social media
- acute lymphoblastic leukemia
- health information
- induced pluripotent stem cells