EDock: blind protein-ligand docking by replica-exchange monte carlo simulation.
Wenyi ZhangEric W BellMinghao YinYang ZhangPublished in: Journal of cheminformatics (2020)
Protein-ligand docking is an important approach for virtual screening and protein function annotation. Although many docking methods have been developed, most require a high-resolution crystal structure of the receptor and a user-specified binding site to start. This information is, however, not available for the majority of unknown proteins, including many pharmaceutically important targets. Developing blind docking methods without predefined binding sites and working with low-resolution receptor models from protein structure prediction is thus essential. In this manuscript, we propose a novel Monte Carlo based method, EDock, for blind protein-ligand docking. For a given protein, binding sites are first predicted by sequence-profile and substructure-based comparison searches with initial ligand poses generated by graph matching. Next, replica-exchange Monte Carlo (REMC) simulations are performed for ligand conformation refinement under the guidance of a physical force field coupled with binding-site distance constraints. The method was tested on two large-scale datasets containing 535 protein-ligand pairs. Without specifying binding pockets on the experimental receptor structures, EDock achieves on average a ligand RMSD of 2.03 Å, which compares favorably with state-of-the-art docking methods including DOCK6 (2.68 Å) and AutoDock Vina (3.92 Å). When starting with predicted models from I-TASSER, EDock still generates reasonable docking models, with a success rate 159% and 67% higher than DOCK6 and AutoDock Vina, respectively. Detailed data analyses show that the major advantage of EDock lies in reliable ligand binding site predictions and extensive REMC sampling, which allows for the implementation of multiple van der Waals weightings to accommodate different levels of steric clashes and cavity distortions and therefore enhances the robustness of low-resolution docking with predicted protein structures.