IDL-PPBopt: A Strategy for Prediction and Optimization of Human Plasma Protein Binding of Compounds via an Interpretable Deep Learning Method.

Chaofeng Lou, Hongbin Yang, Jiye Wang, Mengting Huang, Wei-Hua Li, Guixia Liu, Philip W Lee, Yun Tang
Published in: Journal of Chemical Information and Modeling (2022)
The prediction and optimization of pharmacokinetic properties are essential in lead optimization. Traditional strategies depend mainly on empirical chemical rules from medicinal chemists. However, as the amount of data grows, it is becoming more difficult to extract useful medicinal chemistry knowledge manually. To this end, we introduced IDL-PPBopt, a computational strategy for predicting and optimizing the plasma protein binding (PPB) property based on an interpretable deep learning method. First, a curated PPB data set was used to construct an interpretable deep learning model, which showed excellent predictive performance, with a root mean squared error of 0.112 on the entire test set. Then, we designed a detection protocol based on the model and the Wilcoxon test to identify the PPB-related substructures (named privileged substructures, PSubs) for each molecule. In total, 22 general privileged substructures (GPSubs) were identified, which shared some common features such as nitrogen-containing groups, diamines with two carbon units, and azetidine. Furthermore, a series of second-level chemical rules for each GPSub was derived through a statistical test and then summarized into substructure pairs. We demonstrated that these substructure pairs were equally applicable outside the training set and accordingly customized the structural modification schemes for each GPSub, which provide alternatives for optimizing the PPB property. Therefore, IDL-PPBopt provides a promising scheme for the prediction and optimization of the PPB property and may also be helpful for lead optimization of other pharmacokinetic properties.
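The two quantitative ingredients named in the abstract, a root mean squared error and a Wilcoxon test, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The abstract does not say which Wilcoxon variant was used or what scores were compared, so the sketch assumes a two-sample rank-sum test with a normal approximation, applied to hypothetical per-atom importance scores. All function names and input values are illustrative.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def average_ranks(values):
    """1-based ranks of values; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Wilcoxon rank-sum test (normal approximation, no tie correction).

    Returns (z, two_sided_p). ASSUMPTION: the paper may use a different
    Wilcoxon variant or an exact test; this is only an illustration.
    """
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])                      # rank sum of the first group
    mean = n1 * (n1 + n2 + 1) / 2            # expected rank sum under H0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2))     # two-sided normal tail
    return z, p


# Hypothetical example values (not taken from the paper).
substructure_scores = [0.62, 0.71, 0.58, 0.66]   # atoms inside a candidate substructure
other_scores = [0.21, 0.35, 0.18, 0.40, 0.27]    # remaining atoms of the molecule
z, p = rank_sum_test(substructure_scores, other_scores)
print(f"z = {z:.3f}, two-sided p = {p:.4f}")
```

A positive z with a small p-value would suggest that the candidate substructure's atoms tend to score higher than the rest of the molecule. The significance threshold and the source of the per-atom scores would have to come from the paper itself.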
Keyphrases
  • deep learning
  • machine learning
  • convolutional neural network
  • artificial intelligence
  • binding protein
  • big data
  • transcription factor
  • data analysis
  • dna binding
  • label free