Performance of an Artificial Intelligence System for Breast Cancer Detection on Screening Mammograms from BreastScreen Norway.

Marthe Larsen, Camilla F Aglen, Christoph I Lee, Tone Hovda, Solveig Roth Hoff, Marit A Martiniussen, Karl Øyvind Mikalsen, Håkon Lund-Hanssen, Helene S Solli, Marko Silberhorn, Åse Ø Sulheim, Steinar Auensen, Jan F Nygård, Solveig Hofvind
Published in: Radiology: Artificial Intelligence (2024)
Purpose: To explore the stand-alone breast cancer detection performance, at different risk score thresholds, of a commercially available artificial intelligence (AI) system.

Materials and Methods: This retrospective study included information from 661 695 digital mammographic examinations performed among 242 629 female individuals screened as a part of BreastScreen Norway, 2004-2018. The study sample included 3807 screen-detected cancers and 1110 interval breast cancers. A continuous examination-level risk score by the AI system was used to measure performance as the area under the receiver operating characteristic curve (AUC) with 95% CIs and cancer detection at different AI risk score thresholds.

Results: The AUC of the AI system was 0.93 (95% CI: 0.92, 0.93) for screen-detected cancers and interval breast cancers combined and 0.97 (95% CI: 0.97, 0.97) for screen-detected cancers. In a setting where 10% of the examinations with the highest AI risk scores were defined as positive and 90% with the lowest scores as negative, 92.0% (3502 of 3807) of the screen-detected cancers and 44.6% (495 of 1110) of the interval breast cancers were identified with AI. In this scenario, 68.5% (10 987 of 16 040) of false-positive screening results (negative recall assessment) were considered negative by AI. When 50% was used as the cutoff, 99.3% (3781 of 3807) of the screen-detected cancers and 85.2% (946 of 1110) of the interval breast cancers were identified as positive by AI, whereas 17.0% (2725 of 16 040) of the false-positive results were considered negative.

Conclusion: The AI system showed high performance in detecting breast cancers within 2 years of screening mammography and a potential for use to triage low-risk mammograms to reduce radiologist workload.

Keywords: Mammography, Breast, Screening, Convolutional Neural Network (CNN), Deep Learning Algorithms

Supplemental material is available for this article. © RSNA, 2024. See also commentary by Bahl and Do in this issue.
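The percentages in the Results above follow directly from the reported counts. As a minimal sketch, the Python snippet below recomputes them for the 10% and 50% cutoffs. The dictionary layout and the names `REPORTED`, `pct`, and `summarize` are illustrative assumptions, not part of the study's analysis code, and the "combined" figure is a derived value that the abstract does not report.

```python
# Counts copied from the abstract; nothing here is re-derived from raw data.
SCREEN_DETECTED = 3807          # screen-detected cancers in the study sample
INTERVAL = 1110                 # interval breast cancers in the study sample
FALSE_POSITIVE_RECALLS = 16040  # false-positive screening results

# Keyed by the share (in %) of highest-scoring examinations defined as positive.
# Hypothetical field names, chosen for this illustration only.
REPORTED = {
    10: {"sdc_flagged": 3502, "ic_flagged": 495, "fp_ai_negative": 10987},
    50: {"sdc_flagged": 3781, "ic_flagged": 946, "fp_ai_negative": 2725},
}


def pct(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


def summarize(top_percent):
    """Recompute the detection percentages for one AI risk score cutoff."""
    counts = REPORTED[top_percent]
    cancers_flagged = counts["sdc_flagged"] + counts["ic_flagged"]
    return {
        "screen_detected": pct(counts["sdc_flagged"], SCREEN_DETECTED),
        "interval": pct(counts["ic_flagged"], INTERVAL),
        "false_positives_ai_negative": pct(
            counts["fp_ai_negative"], FALSE_POSITIVE_RECALLS
        ),
        # Derived here, not stated in the abstract.
        "combined": pct(cancers_flagged, SCREEN_DETECTED + INTERVAL),
    }


if __name__ == "__main__":
    for cutoff in sorted(REPORTED):
        print(f"top {cutoff}% positive:", summarize(cutoff))
```

Under these counts, the derived combined detection rate for screen-detected and interval cancers works out to 81.3% (3997 of 4917) at the 10% cutoff and 96.1% (4727 of 4917) at the 50% cutoff. This shows the trade-off the abstract describes: a stricter cutoff dismisses more false-positive recalls but misses more interval cancers.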