Exploration of chemical space with partial labeled noisy student self-training and self-supervised graph embedding.
Yang Liu, Hansaim Lim, Lei Xie
Published in: BMC Bioinformatics (2022)
To better exploit chemical structures as input for machine learning algorithms, we proposed a self-supervised, graph neural network-based embedding method that encodes substructure information. We also developed a model-agnostic self-training method, PLANS, that can be applied to any deep learning architecture to improve prediction accuracy. PLANS offers a way to make better use of partially labeled and unlabeled data. Comprehensive benchmark studies demonstrated the potential of both methods for predicting drug metabolism and toxicity profiles from sparse, noisy, and imbalanced data. PLANS-GINFP could serve as a general solution for improving predictive QSAR modeling.
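As an illustration of what using partially labeled data can look like in a multi-task setting, the following is a minimal sketch of generic pseudo-labeling, not the authors' PLANS implementation. It assumes labels are stored as a matrix with NaN marking missing entries and that a previously trained teacher model has already produced probabilities; the function names `masked_bce` and `fill_missing_labels` and the toy data are hypothetical.

```python
import numpy as np

def masked_bce(y_true, y_prob, eps=1e-7):
    """Binary cross-entropy averaged over observed (non-NaN) labels only."""
    mask = ~np.isnan(y_true)
    y = np.where(mask, y_true, 0.0)          # placeholder value where missing
    p = np.clip(y_prob, eps, 1.0 - eps)      # avoid log(0)
    loss = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    return float((loss * mask).sum() / mask.sum())

def fill_missing_labels(y_true, teacher_prob):
    """Keep observed labels; replace missing ones with teacher soft predictions."""
    return np.where(np.isnan(y_true), teacher_prob, y_true)

# Toy data (assumed): 3 molecules, 2 tasks; NaN marks a missing label.
y_partial = np.array([[1.0, np.nan],
                      [np.nan, 0.0],
                      [0.0, 1.0]])
# Hypothetical teacher predictions for every (molecule, task) pair.
teacher_prob = np.array([[0.9, 0.2],
                         [0.4, 0.1],
                         [0.3, 0.8]])

# Targets a student model could be trained on: fully populated soft labels.
y_student = fill_missing_labels(y_partial, teacher_prob)
teacher_loss = masked_bce(y_partial, teacher_prob)
```

In a self-training loop of this kind, the teacher would be fit with a masked loss on the observed labels, and a student would then be trained on the filled-in targets, typically with added noise such as dropout or data augmentation.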
Keyphrases
- machine learning
- neural network
- big data
- deep learning
- artificial intelligence
- health insurance
- electronic health record
- convolutional neural network
- pet imaging
- virtual reality
- high resolution
- oxidative stress
- emergency department
- molecular dynamics
- health information
- healthcare
- computed tomography
- adverse drug
- social media
- medical education