Letter to the Editor Regarding Article "Prior to Initiation of Chemotherapy, Can We Predict Breast Tumor Response? Deep Learning Convolutional Neural Networks Approach Using a Breast MRI Tumor Dataset".
Joren Brunekreef
Published in: Journal of Imaging Informatics in Medicine (2024)
The cited article reports on a convolutional neural network trained to predict response to neoadjuvant chemotherapy from pre-treatment breast MRI scans. The proposed algorithm attains impressive performance on the test dataset, with a mean area under the receiver operating characteristic curve (AUC) of 0.98 and a mean accuracy of 88%. In this letter, I raise the concern that the reported results can be explained by inadvertent data leakage between the training and test datasets. More precisely, I conjecture that the random split of the full dataset into training and test sets was performed not at the patient level but at the level of individual 2D MRI slices. Under such a split, slices from the same patient appear in both sets, which allows the neural network to "memorize" a patient's anatomy and treatment outcome rather than discover features that are genuinely useful for predicting treatment response. To provide evidence for these claims, I present results of similar experiments that I conducted on a public breast MRI dataset, in which the suspected data leakage mechanism closely reproduces the results reported in the cited work.
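The difference between the two splitting strategies can be illustrated with a small sketch. It is not the code used in this letter or in the cited article; the dataset is synthetic, and all names (`make_slices`, `slice_level_split`, `patient_level_split`, `shared_patients`) and sizes are hypothetical, chosen only to show how a slice-level split places the same patient in both sets while a patient-level split does not.

```python
import random


def make_slices(n_patients=20, slices_per_patient=30):
    """Build a synthetic dataset as (patient_id, slice_index) pairs (assumed sizes)."""
    return [(p, s) for p in range(n_patients) for s in range(slices_per_patient)]


def slice_level_split(slices, test_fraction=0.2, seed=0):
    """Shuffle individual 2D slices and split them, ignoring patient identity."""
    rng = random.Random(seed)
    shuffled = list(slices)
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)


def patient_level_split(slices, test_fraction=0.2, seed=0):
    """Shuffle patients and assign all slices of a patient to one side only."""
    rng = random.Random(seed)
    patients = sorted({p for p, _ in slices})
    rng.shuffle(patients)
    n_test = max(1, int(len(patients) * test_fraction))
    test_patients = set(patients[:n_test])
    train = [x for x in slices if x[0] not in test_patients]
    test = [x for x in slices if x[0] in test_patients]
    return train, test


def shared_patients(train, test):
    """Return the set of patient IDs that occur in both train and test."""
    return {p for p, _ in train} & {p for p, _ in test}


if __name__ == "__main__":
    data = make_slices()
    for name, split in [("slice-level", slice_level_split),
                        ("patient-level", patient_level_split)]:
        train, test = split(data)
        overlap = shared_patients(train, test)
        print(f"{name}: {len(overlap)} patients appear in both train and test")
```

With a slice-level split, neighbouring slices of one patient, which are nearly identical anatomically and share a single outcome label, end up on both sides of the split. In practice the same patient-level grouping is often obtained with a group-aware splitter such as scikit-learn's `GroupShuffleSplit`, passing the patient ID as the group.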
Keyphrases
- convolutional neural network
- deep learning
- contrast enhanced
- neoadjuvant chemotherapy
- neural network
- magnetic resonance imaging
- locally advanced
- diffusion weighted imaging
- artificial intelligence
- computed tomography
- electronic health record
- case report
- healthcare
- machine learning
- squamous cell carcinoma
- mental health
- lymph node
- pulmonary embolism
- radiation therapy
- emergency department
- rectal cancer
- sentinel lymph node
- early stage
- data analysis
- rna seq
- single cell