Predicting Potent Compounds Using a Conditional Variational Autoencoder Based upon a New Structure-Potency Fingerprint.

Tiago Janela, Kosuke Takeuchi, Jürgen Bajorath
Published in: Biomolecules (2023)
Prediction of the potency of bioactive compounds generally relies on linear or nonlinear quantitative structure-activity relationship (QSAR) models. Nonlinear models are generated using machine learning methods. We introduce a novel approach for potency prediction that depends on a newly designed molecular fingerprint (FP) representation. This structure-potency fingerprint (SPFP) combines different modules accounting for the structural features of active compounds and their potency values in a single bit string, hence unifying structure and potency representation. This encoding makes it possible to derive a conditional variational autoencoder (CVAE) from the SPFPs of training compounds and to apply the model to predict the SPFP potency module of test compounds using only their structure module as input. The SPFP-CVAE approach correctly predicts the potency values of compounds belonging to different activity classes with an accuracy comparable to that of support vector regression (SVR), which represents the state of the art in the field. In addition, highly potent compounds are predicted with an accuracy very similar to that of SVR and deep neural networks.
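To make the idea of a single bit string holding both a structure module and a potency module more concrete, the following is a minimal sketch in Python. It is not the SPFP design from the paper: the thermometer-style potency encoding, the module length `POTENCY_BITS`, the value range `P_MIN` to `P_MAX`, and all function names are assumptions chosen for illustration, and the CVAE itself is not shown. The sketch only illustrates how a potency value could be written into, and read back from, a dedicated segment of a fingerprint.

```python
from typing import List, Tuple

# Hypothetical settings, not taken from the paper.
POTENCY_BITS = 8           # assumed length of the potency module
P_MIN, P_MAX = 4.0, 12.0   # assumed range of logarithmic potency values


def encode_potency(value: float) -> List[int]:
    """Thermometer-encode a potency value: higher potency sets more bits."""
    clipped = min(max(value, P_MIN), P_MAX)
    n_set = round((clipped - P_MIN) / (P_MAX - P_MIN) * POTENCY_BITS)
    return [1] * n_set + [0] * (POTENCY_BITS - n_set)


def decode_potency(bits: List[int]) -> float:
    """Map a potency module back to a value at the resolution of the encoding."""
    return P_MIN + sum(bits) / POTENCY_BITS * (P_MAX - P_MIN)


def build_spfp(structure_bits: List[int], potency: float) -> List[int]:
    """Concatenate the structure module and the potency module."""
    return list(structure_bits) + encode_potency(potency)


def split_spfp(spfp: List[int]) -> Tuple[List[int], List[int]]:
    """Separate a combined fingerprint into (structure module, potency module)."""
    return spfp[:-POTENCY_BITS], spfp[-POTENCY_BITS:]


# Toy structure module; a real one would come from a structural fingerprint.
structure = [1, 0, 1, 1, 0, 0, 1, 0]
spfp = build_spfp(structure, 8.0)
structure_module, potency_module = split_spfp(spfp)
recovered = decode_potency(potency_module)
```

Under this assumed layout, a generative model would be trained on complete fingerprints and, at prediction time, would receive only the structure module and produce the potency module, which `decode_potency` would then turn back into a value.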
Keyphrases
  • neural network
  • structure activity relationship
  • molecular dynamics
  • mass spectrometry
  • quality control
  • molecular dynamics simulations