This review demonstrates that the overall scientific quality of studies conducted to feed artificial intelligence algorithms is low. The design and validation of these studies could be improved by developing a standardized guideline that promotes the reproducibility and generalizability of results and, in turn, their clinical application.