Rescoring of docking poses under Occam's Razor: are there simpler solutions?
Michael Zhenin, Malkeet Singh Bahia, Gilles Marcou, Alexandre Varnek, Hanoch Senderowitz, Dragos Horvath
Published in: Journal of Computer-Aided Molecular Design (2018)
Ligand affinity prediction from docking simulations is usually performed by means of highly empirical and diverse protocols. These protocols often involve the re-scoring of poses generated by a force field (FF) based Hamiltonian to provide either estimated binding affinities or, alternatively, some empirical goodness score. Re-scoring is performed by so-called scoring functions, typically a reweighted sum of FF terms augmented by additional terms (e.g., desolvation/entropic penalty, hydrophobicity, aromatic interactions, etc.). Sometimes the scoring function actually drives ligand positioning, but often it only operates on the top-ranked poses produced by the initial ligand positioning tool. In either of these rather intricate scenarios, scoring functions are docking-specific models, and most require machine-learning-based calibration. Docking simulations are therefore less straightforward than "standard" molecular simulations, in which the FF Hamiltonian defines the energy and affinity emerges as an ensemble average property over pools of representative conformers (i.e., the trajectory). To paraphrase Occam's Razor, additional model complexity is only acceptable if it is demonstrated to bring a significant improvement in prediction quality. In this work we therefore examined whether the complexity inherent to scoring functions is indeed justified. For this purpose we compared the Sampler for Multiple Protein-Ligand Entities (S4MPLE), a general-purpose conformation sampler based on the AMBER/GAFF FF complemented with continuum solvation terms, with several state-of-the-art docking tools that rely on calibrated scoring functions (Glide, Gold, Autodock-Vina) in terms of its ability to top-rank the actives from large and diverse ligand series associated with various proteins. There is no clear winner in this study: each program performed well on most of the targets, but also failed on at least one of them.
Therefore, a well-parameterized force field with a simple, energy-based ligand ranking protocol appears to be as effective a docking protocol as intricate rescoring strategies based on scoring functions. A tool that can sample the conformational space of the free ligand, the bound ligand and the protein binding site using the same force field may avoid many of the approximations common to contemporary docking protocols and allow, e.g., for docking into highly flexible active sites, where current scoring functions are not well suited to estimate receptor strain energies.
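To make the idea of a simple, energy-based ranking concrete, the following Python sketch shows one way such a protocol could be expressed. It is an illustration only and not the procedure used in the study: the function names, the Boltzmann-averaged energy difference used as a score, the value of kT, and the use of ROC AUC as the measure of how well actives are top-ranked are all assumptions introduced here.

```python
import math

# Assumption: energies are in kcal/mol and kT ~ 0.593 kcal/mol (about 298 K).
KT = 0.593


def boltzmann_average(energies, kt=KT):
    """Boltzmann-weighted mean energy over a pool of sampled conformers."""
    e_min = min(energies)  # shift by the minimum for numerical stability
    weights = [math.exp(-(e - e_min) / kt) for e in energies]
    z = sum(weights)
    return sum(w * e for w, e in zip(weights, energies)) / z


def energy_score(complex_e, ligand_e, site_e, kt=KT):
    """Hypothetical score: complex energy minus free ligand and free site.

    All three pools are assumed to come from the same force field,
    so no reweighting or calibration of terms is applied.
    Lower (more negative) scores mean better predicted binding.
    """
    return (boltzmann_average(complex_e, kt)
            - boltzmann_average(ligand_e, kt)
            - boltzmann_average(site_e, kt))


def roc_auc(scores, is_active):
    """Fraction of (active, inactive) pairs in which the active scores lower.

    Ties count as half a win. 1.0 means every active outranks every inactive.
    """
    actives = [s for s, a in zip(scores, is_active) if a]
    inactives = [s for s, a in zip(scores, is_active) if not a]
    wins = 0.0
    for s_act in actives:
        for s_inact in inactives:
            if s_act < s_inact:
                wins += 1.0
            elif s_act == s_inact:
                wins += 0.5
    return wins / (len(actives) * len(inactives))


# Made-up scores for four ligands of one target (not data from the paper).
example_scores = [-9.1, -7.4, -3.2, -2.8]
example_labels = [True, False, True, False]
example_auc = roc_auc(example_scores, example_labels)
```

In this sketch the score contains no fitted weights, which is the contrast with calibrated scoring functions drawn above; any real comparison would of course depend on how the conformer pools are generated and on the chosen ranking metric.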