
Patient Characteristics Impact Performance of AI Algorithm in Interpreting Negative Screening Digital Breast Tomosynthesis Studies.

Derek L Nguyen, Yinhao Ren, Tyler M Jones, Samantha M Thomas, Joseph Y Lo, Lars J Grimm
Published in: Radiology (2024)
Background: Artificial intelligence (AI) is increasingly used to manage radiologists' workloads. The impact of patient characteristics on AI performance has not been well studied.

Purpose: To understand the impact of patient characteristics (race and ethnicity, age, and breast density) on the performance of an AI algorithm interpreting negative screening digital breast tomosynthesis (DBT) examinations.

Materials and Methods: This retrospective cohort study identified negative screening DBT examinations from an academic institution from January 1, 2016, to December 31, 2019. All examinations had 2 years of follow-up without a diagnosis of atypia or breast malignancy and were therefore considered true negatives. A subset of unique patients was randomly selected to provide a broad distribution of race and ethnicity. DBT studies in this final cohort were interpreted by a U.S. Food and Drug Administration-approved AI algorithm, which generated case scores (malignancy certainty) and risk scores (1-year subsequent malignancy risk) for each mammogram. Positive examinations were classified based on vendor-provided thresholds for both scores. Multivariable logistic regression was used to understand relationships between the scores and patient characteristics.

Results: A total of 4855 patients (median age, 54 years [IQR, 46-63 years]) were included: 27% (1316 of 4855) White, 26% (1261 of 4855) Black, 28% (1351 of 4855) Asian, and 19% (927 of 4855) Hispanic patients. False-positive case scores were significantly more likely in Black patients (odds ratio [OR] = 1.5 [95% CI: 1.2, 1.8]) and less likely in Asian patients (OR = 0.7 [95% CI: 0.5, 0.9]) compared with White patients, and more likely in older patients (71-80 years; OR = 1.9 [95% CI: 1.5, 2.5]) and less likely in younger patients (41-50 years; OR = 0.6 [95% CI: 0.5, 0.7]) compared with patients aged 51-60 years. False-positive risk scores were more likely in Black patients (OR = 1.5 [95% CI: 1.0, 2.0]), patients aged 61-70 years (OR = 3.5 [95% CI: 2.4, 5.1]), and patients with extremely dense breasts (OR = 2.8 [95% CI: 1.3, 5.8]) compared with White patients, patients aged 51-60 years, and patients with fatty density breasts, respectively.

Conclusion: Patient characteristics influenced the case and risk scores of a Food and Drug Administration-approved AI algorithm analyzing negative screening DBT examinations.

© RSNA, 2024.
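
Note on reading the odds ratios: the abstract reports odds ratios with 95% confidence intervals from a multivariable logistic regression but does not say how the intervals were computed. The sketch below shows one common approach, a Wald interval, in which a fitted coefficient (a log-odds value) and its standard error are exponentiated. The function name and the input values are hypothetical and chosen only for illustration; they are not taken from the study.

```python
import math


def odds_ratio_with_ci(beta, se, z=1.96):
    """Convert a logistic regression coefficient into an odds ratio.

    beta: fitted coefficient on the log-odds scale
    se:   standard error of that coefficient
    z:    normal quantile for the interval (1.96 gives roughly 95%)

    Returns (odds_ratio, lower_bound, upper_bound) using a Wald interval,
    which is an assumption here, not the study's documented method.
    """
    odds_ratio = math.exp(beta)
    lower = math.exp(beta - z * se)
    upper = math.exp(beta + z * se)
    return odds_ratio, lower, upper


# Hypothetical coefficient and standard error, for illustration only.
example_beta = 0.40
example_se = 0.10

or_value, ci_low, ci_high = odds_ratio_with_ci(example_beta, example_se)
print(f"OR = {or_value:.2f} [95% CI: {ci_low:.2f}, {ci_high:.2f}]")
```

Under this reading, an interval that lies entirely above 1 indicates higher odds of a false-positive score than in the reference group, and one entirely below 1 indicates lower odds.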
Keyphrases
  • end stage renal disease
  • newly diagnosed
  • artificial intelligence
  • ejection fraction
  • chronic kidney disease
  • prognostic factors
  • peritoneal dialysis
  • machine learning
  • patient reported outcomes
  • deep learning