Best practice and reproducible science are required to advance artificial intelligence in real-world applications.

Zhichao Liu, Ting Li, Skylar Connor, Shraddha Thakkar, Ruth Roberts, Weida Tong

Published in: Briefings in Bioinformatics (2022)

Drug-induced liver injury (DILI) is one of the most significant concerns in medical practice, yet it still cannot be fully recapitulated with existing in vivo, in vitro and in silico approaches. To address this challenge, Chen et al. [1] developed a deep learning DILI prediction model that uses chemical structure information alone. The reported model yielded outstanding prediction performance on a test set (0.958, 0.976, 0.935, 0.947, 0.926 and 0.913 for AUC, accuracy, recall, precision, F1-score and specificity, respectively), far outperforming all publicly available and similar in silico DILI models. This extraordinary performance is counter-intuitive given what we know about the underlying biology of DILI and the principles and hypotheses behind this type of in silico approach. In this Letter to the Editor, we raise awareness of several issues concerning data curation, model validation and comparison practices, and data and model reproducibility.
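For readers less familiar with the threshold-based metrics quoted above, the following is a minimal sketch of how accuracy, recall, precision, F1-score and specificity are conventionally derived from a binary confusion matrix. The function name `binary_metrics` and the counts in the usage lines are hypothetical and chosen only for illustration; they are not taken from Chen et al. or from this Letter. AUC is omitted because it requires ranked prediction scores rather than counts.

```python
def binary_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives (here, "positive" would mean a DILI-positive compound).
    """
    recall = tp / (tp + fn)            # sensitivity: positives correctly found
    precision = tp / (tp + fp)         # predicted positives that are correct
    specificity = tn / (tn + fp)       # negatives correctly found
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F1 is the harmonic mean of precision and recall, so for a single
    # confusion matrix it always lies between those two values.
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
        "specificity": specificity,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Computing all metrics from one set of counts, as here, makes it straightforward to check that a reported set of values is mutually consistent.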

Keyphrases