Login / Signup

Exploration of chemical space with partial labeled noisy student self-training and self-supervised graph embedding.

Yang LiuHansaim LimLei Xie
Published in: BMC bioinformatics (2022)
To better exploit chemical structures as an input for machine learning algorithms, we proposed a self-supervised graph neural network-based embedding method that can encode substructure information. Furthermore, we developed a model agnostic self-training method, PLANS, that can be applied to any deep learning architectures to improve prediction accuracies. PLANS provided a way to better utilize partially labeled and unlabeled data. Comprehensive benchmark studies demonstrated their potentials in predicting drug metabolism and toxicity profiles using sparse, noisy, and imbalanced data. PLANS-GINFP could serve as a general solution to improve the predictive modeling for QSAR modeling.
Keyphrases