Impact of the Volume and Distribution of Training Datasets in the Development of Deep-Learning Models for the Diagnosis of Colorectal Polyps in Endoscopy Images.
Eun-Jeong Gong, Chang-Seok Bang, Jae Jun Lee, Young Joo Yang, Gwang Ho Baik
Published in: Journal of Personalized Medicine (2022)
Because the colonoscopy classification model reaches a data-volume-dependent performance plateau, doubling or tripling the training dataset is not always beneficial. Deep-learning models would be more accurate if the proportion of lesions from under-represented categories were increased.
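The summary does not say how the proportion of rarer lesion categories would be raised. As an illustration only, the minimal Python sketch below uses random oversampling (duplicating samples with replacement), which is one common option and not necessarily the authors' method. The function name `oversample_minority`, the labels `"common"` and `"rare"`, and the 25% target are all hypothetical.

```python
import math
import random
from collections import Counter


def oversample_minority(labels, target_fraction, seed=0):
    """Return sample indices in which the rarest class reaches target_fraction.

    Hypothetical helper for illustration: it duplicates randomly chosen
    minority-class samples (with replacement) and leaves other classes as-is.
    """
    if not 0 < target_fraction < 1:
        raise ValueError("target_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)
    counts = Counter(labels)
    minority = min(counts, key=counts.get)  # rarest category
    minority_idx = [i for i, y in enumerate(labels) if y == minority]
    n_other = len(labels) - len(minority_idx)
    # Smallest minority count m satisfying m / (m + n_other) >= target_fraction.
    needed = math.ceil(target_fraction * n_other / (1 - target_fraction))
    extra = max(0, needed - len(minority_idx))
    # Keep every original sample, then append the duplicated minority samples.
    return list(range(len(labels))) + rng.choices(minority_idx, k=extra)


# Assumed toy label list: 90 common-category images, 10 rare-category images.
labels = ["common"] * 90 + ["rare"] * 10
indices = oversample_minority(labels, target_fraction=0.25)
resampled = [labels[i] for i in indices]
print(Counter(resampled))
```

Oversampling changes only the class mix seen during training; it adds no new images, so any resulting accuracy change would still need to be checked on an untouched test set.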