Frequency and characteristics of errors by artificial intelligence (AI) in reading screening mammography: a systematic review.
Aileen Zeng, Nehmat Houssami, Naomi Noguchi, Brooke Nickel, M Luke Marinovich
Published in: Breast Cancer Research and Treatment (2024)
AI errors are largely interpreted in the framework of test accuracy. False-positive (FP) and false-negative (FN) errors show expected variability not only by positivity threshold, but also by algorithm version and study quality. Reporting of other forms of AI error is sparse, despite their potential implications for adoption of the technology. Considering broader types of AI error would add nuance to reporting and could inform inferences about AI's utility.