
Balancing Performance and Interpretability in Medical Image Analysis: Case Study of Osteopenia

Mateo Mikulić, Dominik Vičević, Eszter Nagy, Mateja Napravnik, Ivan Štajduhar, Sebastian Tschauner, Franko Hržić
Published in: Journal of Imaging Informatics in Medicine (2024)
Multiple studies within the medical field have highlighted the remarkable effectiveness of convolutional neural networks for predicting medical conditions, sometimes even surpassing that of medical professionals. Despite their strong performance, convolutional neural networks operate as black boxes, potentially arriving at correct conclusions for incorrect reasons or areas of focus. Our work explores the possibility of mitigating this phenomenon by identifying and occluding confounding variables within images. Specifically, we focused on the prediction of osteopenia, a serious medical condition, using the publicly available GRAZPEDWRI-DX dataset. After detecting the confounding variables in the dataset, we generated masks that occlude the image regions associated with those variables. By doing so, models were forced to focus on different parts of the images for classification. Model evaluation using F1-score, precision, and recall showed that models trained on non-occluded images typically outperformed models trained on occluded images. However, a test in which radiologists had to choose a model based on the regions of focus extracted by the Grad-CAM method showed a different outcome: the radiologists' preference shifted towards models trained on the occluded images. These results suggest that while occluding confounding variables may degrade model performance, it enhances interpretability, providing more reliable insights into the reasoning behind predictions. The code to reproduce our experiment is available at: https://github.com/mikulicmateo/osteopenia .
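To make the two central steps of the abstract concrete, the following minimal sketch (not the authors' released code, which is available in the linked repository) illustrates (1) occluding image regions associated with confounding variables given assumed pixel-coordinate bounding boxes, and (2) computing a Grad-CAM heatmap to inspect which regions a trained CNN attends to. The bounding-box format, the ResNet-18 backbone, and all tensor shapes are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of confounder occlusion + Grad-CAM inspection (illustrative only).
import numpy as np
import torch
import torch.nn.functional as F
from torchvision import models


def occlude_regions(image: np.ndarray, boxes) -> np.ndarray:
    """Zero out rectangular regions (e.g. cast markers or implants).

    `boxes` is assumed to be an iterable of (x_min, y_min, x_max, y_max)
    in pixel coordinates; this format is an assumption for the sketch.
    """
    occluded = image.copy()
    for x_min, y_min, x_max, y_max in boxes:
        occluded[y_min:y_max, x_min:x_max] = 0
    return occluded


def grad_cam(model: torch.nn.Module, target_layer: torch.nn.Module,
             x: torch.Tensor, class_idx: int) -> torch.Tensor:
    """Compute a Grad-CAM heatmap for a single image tensor of shape (1, C, H, W)."""
    activations, gradients = [], []
    h1 = target_layer.register_forward_hook(lambda m, i, o: activations.append(o))
    h2 = target_layer.register_full_backward_hook(lambda m, gi, go: gradients.append(go[0]))
    try:
        model.zero_grad()
        score = model(x)[0, class_idx]          # class score for the chosen class
        score.backward()
        acts, grads = activations[0], gradients[0]           # (1, K, h, w)
        weights = grads.mean(dim=(2, 3), keepdim=True)       # channel-wise importance
        cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
        cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    finally:
        h1.remove()
        h2.remove()
    return cam[0, 0].detach()


# Usage sketch: a randomly initialised ResNet-18 stands in for whichever CNN
# was trained on occluded vs. non-occluded GRAZPEDWRI-DX radiographs.
model = models.resnet18(weights=None, num_classes=2).eval()
x = torch.randn(1, 3, 224, 224)                 # placeholder radiograph tensor
heatmap = grad_cam(model, model.layer4[-1], x, class_idx=1)
```

In this sketch the occlusion happens as a preprocessing step before training, and the Grad-CAM heatmap is what a radiologist could inspect when comparing models trained on occluded versus non-occluded images, mirroring the evaluation described in the abstract.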