Precise Image-level Localization of Intracranial Hemorrhage on Head CT Scans with Deep Learning Models Trained on Study-level Labels.

Yunan Wu, Michael Iorga, Suvarna Badhe, James Zhang, Donald R Cantrell, Elaine J Tanhehco, Nicholas Szrama, Andrew M Naidech, Michael Drakopoulos, Shamis T Hasan, Kunal M Patel, Tarek A Hijaz, Eric J Russell, Shamal Lalvani, Amit Adate, Todd B Parrish, Aggelos K Katsaggelos, Virginia B Hill

Published in: Radiology: Artificial Intelligence (2024)

"Just Accepted" papers have undergone full peer review and have been accepted for publication in Radiology: Artificial Intelligence . This article will undergo copyediting, layout, and proof review before it is published in its final version. Please note that during production of the final copyedited article, errors may be discovered which could affect the content. Purpose To develop a highly generalizable weakly supervised model to automatically detect and localize image- level intracranial hemorrhage (ICH) using study-level labels. Materials and Methods In this retrospective study, the proposed model was pretrained on the image-level RSNA dataset and fine-tuned on a local dataset using attention-based bidirectional long-short-term memory networks. This local training dataset included 10,699 noncontrast head CT scans from 7469 patients with ICH study-level labels extracted from radiology reports. Model performance was compared with that of two senior neuroradiologists on 100 random test scans using the McNemar test, and its generalizability was evaluated on an external independent dataset. Results The model achieved a positive predictive value (PPV) of 85.7% (95% CI: [84.0%, 87.4%]) and an AUC of 0.96 (95% CI: [0.96, 0.97]) on the held-out local test set ( n = 7243, 3721 female) and 89.3% (95% CI: [87.8%, 90.7%]) and 0.96 (95% CI: [0.96, 0.97]), respectively, on the external test set ( n = 491, 178 female). For 100 randomly selected samples, the model achieved performance on par with two neuroradiologists, but with a significantly faster ( P < .05) diagnostic time of 5.04 seconds per scan (versus 86 seconds and 22.2 seconds for the two neuroradiologists, respectively). The model's attention weights and heatmaps visually aligned with neuroradiologists' interpretations. 
Conclusion The proposed model demonstrated high generalizability and high PPVs, offering a valuable tool for expedited ICH detection and prioritization while reducing false-positive interruptions in radiologists' workflows. ©RSNA, 2024.
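
As a rough illustration of how study-level labels can still yield image-level localization, the sketch below pools per-slice features with attention weights and classifies only the pooled result. It is a minimal sketch under stated assumptions, not the authors' architecture: the bidirectional LSTM is omitted, and the names `attention_pool`, `study_probability`, `attn_vector`, `clf_weights`, and `clf_bias`, as well as all feature values, are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(slice_features, attn_vector):
    """Aggregate per-slice features into one study-level feature.

    slice_features: one feature vector per CT slice (hypothetical stand-ins
                    for the outputs of a sequence model over slices).
    attn_vector:    hypothetical learned vector that scores each slice.
    Returns (study_feature, attention_weights).
    """
    scores = [sum(w * x for w, x in zip(attn_vector, feat))
              for feat in slice_features]
    weights = softmax(scores)
    dim = len(slice_features[0])
    study_feature = [
        sum(a * feat[d] for a, feat in zip(weights, slice_features))
        for d in range(dim)
    ]
    return study_feature, weights

def study_probability(study_feature, clf_weights, clf_bias):
    """Logistic classifier on the pooled feature.

    Only this output would be compared with the study-level label in training.
    """
    z = sum(w * x for w, x in zip(clf_weights, study_feature)) + clf_bias
    return 1.0 / (1.0 + math.exp(-z))

# Toy study with four slices; the third slice carries a strong signal.
features = [[0.1, 0.0], [0.2, 0.1], [2.5, 1.8], [0.3, 0.1]]
pooled, weights = attention_pool(features, attn_vector=[1.0, 1.0])
prob = study_probability(pooled, clf_weights=[1.0, 1.0], clf_bias=-2.0)

# The slice with the largest attention weight is the candidate localization.
top_slice = max(range(len(weights)), key=weights.__getitem__)
```

Because the loss in such a setup depends only on the study-level probability, no slice annotations are needed; the attention weights are a by-product that can be inspected per slice.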
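
The two statistics named in the abstract, PPV and the McNemar test for paired reader comparisons, can be sketched as follows. This is an illustrative sketch only: the abstract does not say which McNemar variant was used, so the exact binomial form here is an assumption, and the function names and all counts are made up for the example rather than taken from the study.

```python
from math import comb

def ppv(true_positives, false_positives):
    """Positive predictive value: the fraction of positive calls that are correct."""
    return true_positives / (true_positives + false_positives)

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value for paired binary outcomes.

    b: scans where reader A was correct and reader B was wrong.
    c: scans where reader B was correct and reader A was wrong.
    Scans on which both readers agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    # Under the null hypothesis, each discordant pair favors either reader
    # with probability 1/2, so the smaller count follows Binomial(n, 0.5).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts, for illustration only.
example_ppv = ppv(true_positives=6, false_positives=1)
example_p = mcnemar_exact(b=3, c=5)
```

A large p-value from `mcnemar_exact` would indicate no detectable difference between the paired readers on the discordant scans, which is how a "performance on par" claim is typically supported.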

Keyphrases