Multitask CapsNet: An Imbalanced Data Deep Learning Method for Predicting Toxicants.

Yiwei Wang, Binyou Wang, Jie Jiang, Jianmin Guo, Jia Lai, Xiao-Yuan Lian, Jianming Wu
Published in: ACS Omega (2021)
Drug development has a high failure rate, and safety properties are a considerable part of the challenge. To reduce risk, in silico tools, including various machine learning methods, have been applied to toxicity prediction. However, these approaches often face a serious problem: the training data sets are usually biased (imbalanced between positive and negative samples), which makes models difficult to train and leads to unsatisfactory prediction accuracy. In previous studies, multitask networks obtained significantly better predictive accuracies than single-task methods, and capsule neural networks showed excellent performance on sparse data sets. In this study, we developed a new multitask framework based on a capsule neural network (multitask CapsNet) to predict 12 different toxic effects simultaneously. We found that multitask CapsNet excelled in toxicity prediction and outperformed many other computational approaches that use the multitask strategy. Even though it was trained only on biased data sets, multitask CapsNet achieved significantly improved prediction accuracy on the Tox21 Data Challenge, reaching the highest accuracy on 8 of the 12 targets, the largest share among the compared models. Our model gave a prediction accuracy of 96.6% for the target NR.PPAR.gamma, whose ratio of negative to positive samples was as high as 36:1. These results suggest that multitask CapsNet can overcome the bias problem and provides a novel, accurate, and efficient approach for predicting the toxicities of compounds.
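To make the multitask setting and the imbalance ratio concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each compound carries up to 12 binary labels, that some labels may be missing (marked here with None), and that a masked binary cross-entropy is used as the training loss; the abstract does not state the loss, and the function names are hypothetical.

```python
import math

N_TASKS = 12  # the 12 toxic effects named in the abstract


def masked_multitask_bce(probs, labels):
    """Mean binary cross-entropy over labelled (compound, task) pairs.

    probs:  rows of N_TASKS predicted probabilities in (0, 1)
    labels: same shape; 1 = toxic, 0 = non-toxic, None = not measured
    Assumption: this loss is illustrative, not the one used in the paper.
    """
    total, count = 0.0, 0
    for p_row, y_row in zip(probs, labels):
        for p, y in zip(p_row, y_row):
            if y is None:
                continue  # unlabelled pairs contribute nothing to the loss
            p = min(max(p, 1e-7), 1 - 1e-7)  # clamp to avoid log(0)
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count if count else 0.0


def neg_pos_ratio(labels, task):
    """Ratio of negative to positive samples for one task, ignoring None."""
    column = [row[task] for row in labels if row[task] is not None]
    positives = sum(column)
    negatives = len(column) - positives
    return negatives / positives if positives else float("inf")
```

Because one shared network receives gradients from every labelled task, a target with very few positives can still benefit from features learned on the other targets, which is the usual motivation for the multitask strategy.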
Keyphrases
  • neural network
  • electronic health record
  • big data
  • machine learning
  • deep learning
  • oxidative stress
  • artificial intelligence
  • type diabetes
  • insulin resistance
  • fatty acid
  • molecular dynamics simulations