Baseline Model for Predicting Protein-Ligand Unbinding Kinetics through Machine Learning.
Nurlybek Amangeldiuly, Dmitry S Karlov, Maxim V Fedorov. Published in: Journal of Chemical Information and Modeling (2020)
Deriving structure-kinetics relationships can support the rational design and development of new small-molecule drug candidates with desired residence times. Efforts are now being directed toward the development of efficient computational methods, but robust, high-throughput approaches for predicting binding kinetics on larger datasets are still lacking. We present a prediction method for binding kinetics based on machine learning analysis of protein-ligand structural features, which can serve as a baseline for more sophisticated methods utilizing molecular dynamics (MD). We show that the random forest algorithm can learn the secondary structure and backbone/side-chain features of the protein binding site to predict the binding kinetics of protein-ligand complexes, although its performance remains inferior to that of MD-based descriptor analysis. To date, MD simulations have been applied to kinetics analysis for only a limited number of targets and ligand series, and we believe that the developed approach may guide new studies. The method was trained on a newly curated database of 501 protein-ligand unbinding rate constants, which can also be used for training and testing binding kinetics prediction models.
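The abstract refers to both residence times and unbinding rate constants. As a minimal sketch of how the two relate, the snippet below uses the standard first-order relations: residence time is the reciprocal of the unbinding rate constant (tau = 1 / k_off), and the dissociation half-life is ln(2) / k_off. The function names, the ligand labels, and the numerical k_off values are hypothetical and chosen only for illustration; they are not taken from the curated database. Working with log10(k_off) is likewise an assumption about how such values might be handled, not a statement of the authors' procedure.

```python
import math


def residence_time(k_off: float) -> float:
    """Residence time tau = 1 / k_off (in seconds when k_off is in 1/s)."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return 1.0 / k_off


def dissociation_half_life(k_off: float) -> float:
    """Half-life of the complex for first-order unbinding: ln(2) / k_off."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return math.log(2) / k_off


def log10_k_off(k_off: float) -> float:
    """log10 of k_off; an assumed convenience, since rate constants
    typically span several orders of magnitude."""
    if k_off <= 0:
        raise ValueError("k_off must be positive")
    return math.log10(k_off)


# Hypothetical unbinding rate constants in 1/s (illustrative only).
example_k_off = {"ligand_A": 1e-1, "ligand_B": 1e-3, "ligand_C": 5e-5}

for name, k in example_k_off.items():
    print(
        f"{name}: k_off={k:.1e} 1/s, "
        f"tau={residence_time(k):.1f} s, "
        f"t_half={dissociation_half_life(k):.1f} s, "
        f"log10(k_off)={log10_k_off(k):.2f}"
    )
```

A smaller k_off means a longer residence time, so a model that predicts unbinding rate constants also yields a residence-time estimate through this conversion.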