Deep-learning system to improve the quality and efficiency of volumetric heart segmentation for breast cancer.
Roman Zeleznik, Jakob Weiss, Jana Taron, Christian Guthier, Danielle S Bitterman, Cindy Hancox, Benjamin H Kann, Daniel W Kim, Rinaa S Punglia, Jeremy Bredfeldt, Borek Foldyna, Parastou Eslami, Michael T Lu, Udo Hoffmann, Raymond H Mak, Hugo J W L Aerts
Published in: NPJ digital medicine (2021)
Although artificial intelligence algorithms are often developed and applied for narrow tasks, their implementation in other medical settings could help to improve patient care. Here we assess whether a deep-learning system for volumetric heart segmentation on computed tomography (CT) scans developed in cardiovascular radiology can optimize treatment planning in radiation oncology. The system was trained using multi-center data (n = 858) with manual heart segmentations provided by cardiovascular radiologists. Validation of the system was performed in an independent real-world dataset of 5677 breast cancer patients treated with radiation therapy at the Dana-Farber/Brigham and Women's Cancer Center between 2008 and 2018. In a subset of 20 patients, the performance of the system was compared with that of eight radiation oncology experts by assessing segmentation time, agreement between experts, and accuracy with and without deep-learning assistance. To compare the performance to segmentations used in the clinic, concordance and failures (defined as Dice < 0.85) of the system were evaluated in the entire dataset. The system was successfully applied without retraining. With deep-learning assistance, segmentation time significantly decreased (4.0 min [IQR 3.1-5.0] vs. 2.0 min [IQR 1.3-3.5]; p < 0.001), and agreement increased (Dice 0.95 [IQR = 0.02] vs. 0.97 [IQR = 0.02]; p < 0.001). Expert accuracy was similar with and without deep-learning assistance (Dice 0.92 [IQR = 0.02] vs. 0.92 [IQR = 0.02]; p = 0.48), and not significantly different from deep-learning-only segmentations (Dice 0.92 [IQR = 0.02]; p ≥ 0.1). In comparison to real-world data, the system showed high concordance (Dice 0.89 [IQR = 0.06]) across 5677 patients and a significantly lower failure rate (p < 0.001). These results suggest that deep-learning algorithms can successfully be applied across medical specialties and improve clinical care beyond the original field of interest.
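The agreement and failure figures above are reported as Dice coefficients, with a failure defined as Dice < 0.85. As a minimal sketch of how such numbers could be computed, assuming each segmentation is available as a flattened binary voxel mask, the following Python shows the standard Dice formula and a failure-rate count. The function names `dice` and `failure_rate` and the handling of empty masks are illustrative assumptions, not the authors' implementation.

```python
from typing import Iterable, Sequence

# The abstract defines a failure as Dice < 0.85.
FAILURE_THRESHOLD = 0.85


def dice(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice coefficient 2|A n B| / (|A| + |B|) for two flattened binary masks."""
    if len(mask_a) != len(mask_b):
        raise ValueError("masks must contain the same number of voxels")
    size_a = sum(1 for v in mask_a if v)
    size_b = sum(1 for v in mask_b if v)
    if size_a + size_b == 0:
        # Assumption for this sketch: two empty masks count as perfect agreement.
        return 1.0
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    return 2.0 * overlap / (size_a + size_b)


def failure_rate(scores: Iterable[float],
                 threshold: float = FAILURE_THRESHOLD) -> float:
    """Fraction of cases whose Dice score falls below the failure threshold."""
    scores = list(scores)
    if not scores:
        raise ValueError("at least one Dice score is required")
    return sum(1 for s in scores if s < threshold) / len(scores)


if __name__ == "__main__":
    # Toy 4-voxel masks, made up purely for illustration.
    reference = [1, 1, 0, 0]
    predicted = [1, 0, 0, 0]
    print(round(dice(reference, predicted), 3))
    print(failure_rate([0.90, 0.80, 0.95, 0.84]))
```

In practice the masks would be 3D arrays from CT volumes, and a vectorized library would replace the pure-Python loops; the arithmetic stays the same.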
Keyphrases
- deep learning
- artificial intelligence
- big data
- convolutional neural network
- computed tomography
- machine learning
- healthcare
- end stage renal disease
- radiation therapy
- heart failure
- ejection fraction
- newly diagnosed
- magnetic resonance imaging
- prognostic factors
- peritoneal dialysis
- palliative care
- atrial fibrillation
- young adults
- electronic health record
- polycystic ovary syndrome
- pregnant women
- magnetic resonance
- contrast enhanced
- body composition
- clinical practice
- dual energy
- rectal cancer