Classifying Protein-Protein Binding Affinity with Free-Energy Calculations and Machine Learning Approaches.
Emma Goulard Coderc de Lacam, Benoît Roux, Christophe J. Chipot. Published in: Journal of Chemical Information and Modeling (2024)
Understanding how neurons wire together in the brain is of great interest in neuroscience. In the fruit fly, Drosophila melanogaster, the Dpr-DIP interactome has been identified as playing an important role in this process. However, experimental data suggest that only a limited subset of complexes, essentially 57 out of 231, exhibit strong binding affinity. In this work, we sought to identify the residue-level molecular basis underlying the difference in binding affinity using a state-of-the-art methodology that combines standard binding free-energy calculations along a geometrical route with machine learning (ML) techniques. We determined the binding affinity for two complexes using statistical mechanics simulations, achieving an excellent reproduction of the experimental data. We also predicted the binding free energy for two additional low-affinity complexes for which no experimental estimate is available, and identified key residues for binding in the process. Furthermore, using two ML algorithms, linear discriminant analysis and random forest, we achieved an accuracy as high as 0.99 in distinguishing strong (cognate) from weak (noncognate) binders. The ML approach relies on easily transferable input features, which allows it to be applied to any interactome and helps identify the residues critical for binding. The predictive power of the resulting model was probed on similar protein families from 13 diverse species. The model performed well on these additional data sets, indicating that it is reliable and robust across the species barrier.
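To make the link between a measured binding affinity and a computed binding free energy concrete, the sketch below uses the standard thermodynamic relation ΔG° = RT ln(Kd / C°). It is an illustration only and not the authors' simulation protocol: the function name, the 298.15 K temperature, the 1 M standard concentration, and the two dissociation constants are assumptions chosen for the example, not values from the study.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.98720425864083e-3


def binding_free_energy(kd_molar, temperature_k=298.15, standard_conc_molar=1.0):
    """Return the standard binding free energy (kcal/mol) for a dissociation
    constant Kd, using dG = R * T * ln(Kd / C0).

    The name and default arguments are hypothetical, chosen for this sketch.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature_k * math.log(kd_molar / standard_conc_molar)


if __name__ == "__main__":
    # Hypothetical dissociation constants, not taken from the paper.
    examples = [("stronger binder, Kd = 1 uM", 1e-6),
                ("weaker binder, Kd = 100 uM", 1e-4)]
    for label, kd in examples:
        print(f"{label}: dG = {binding_free_energy(kd):.2f} kcal/mol")
```

A smaller Kd gives a more negative ΔG°, so a hundredfold difference in affinity corresponds to a free-energy gap of RT ln(100), a little under 3 kcal/mol at room temperature.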