Optimization and Evaluation of Site-Identification by Ligand Competitive Saturation (SILCS) as a Tool for Target-Based Ligand Optimization.
Vincent D Ustach, Sirish Kaushik Lakkaraju, Sunhwan Jo, Wenbo Yu, Wenjuan Jiang, Alexander D MacKerell
Published in: Journal of Chemical Information and Modeling (2019)
Chemical fragment cosolvent sampling techniques have become a versatile tool in ligand-protein binding prediction. Site-identification by ligand competitive saturation (SILCS) is one such method; it maps the distribution of chemical fragments on a protein as free energy fields called FragMaps. Ligands are then simulated via Monte Carlo techniques in the field of the FragMaps (SILCS-MC) to predict their binding conformations and relative affinities for the target protein. SILCS-MC was applied with a number of different scoring schemes and MC sampling protocols against multiple protein targets to evaluate and optimize the predictive capability of the method. Seven protein targets and 551 ligands with broad chemical variability were used to evaluate and optimize the model with respect to Pearson's correlation coefficient, Pearlman's predictive index, the percentage of correctly predicted relative binding affinities, and the root-mean-square error versus the absolute experimental binding affinities. Across the protein-ligand sets, the relative affinities of the ligands were predicted correctly an average of 69% of the time with the best-performing overall SILCS protocol. Training the FragMap weighting factors using a Bayesian machine learning (ML) algorithm increased this to an average of 75% correct relative affinity predictions. Furthermore, once the optimal protocol is identified for a specific protein-ligand system, average predictabilities of 76% are achieved. The ML algorithm is successful with small training sets (30 or more compounds) because physically correct FragMap weights are used as priors. Notably, the 76% correct relative prediction rate is similar to or better than that of free energy perturbation methods, which are significantly more computationally expensive than SILCS. The results further support the utility of SILCS as a powerful and computationally accessible tool to support lead optimization and development in drug discovery.
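The four evaluation metrics named above can be illustrated with a minimal sketch. It is not the authors' code. It assumes that "correct relative binding affinity" means the fraction of ligand pairs whose predicted ordering matches the experimental ordering, and that the predictive index follows the commonly cited pairwise form in which each pair is weighted by its absolute experimental affinity difference. The function names and the four-ligand data set are hypothetical and chosen only for illustration.

```python
import math
from itertools import combinations


def _sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pearson_r(exp, pred):
    """Pearson correlation between experimental and predicted affinities."""
    n = len(exp)
    mean_e, mean_p = sum(exp) / n, sum(pred) / n
    cov = sum((e - mean_e) * (p - mean_p) for e, p in zip(exp, pred))
    var_e = sum((e - mean_e) ** 2 for e in exp)
    var_p = sum((p - mean_p) ** 2 for p in pred)
    return cov / math.sqrt(var_e * var_p)


def rmse(exp, pred):
    """Root-mean-square error of predictions versus experimental values."""
    return math.sqrt(sum((e - p) ** 2 for e, p in zip(exp, pred)) / len(exp))


def percent_correct(exp, pred):
    """Percentage of ligand pairs ranked in the experimental order.

    Assumption: pairs with identical experimental values are skipped.
    """
    pairs = [(i, j) for i, j in combinations(range(len(exp)), 2)
             if exp[i] != exp[j]]
    correct = sum(
        1 for i, j in pairs
        if _sign(exp[j] - exp[i]) == _sign(pred[j] - pred[i])
    )
    return 100.0 * correct / len(pairs)


def predictive_index(exp, pred):
    """Pairwise predictive index, weighted by experimental differences.

    Each pair contributes +w if ranked correctly, -w if ranked incorrectly,
    and 0 if the predictions tie, where w = |exp[j] - exp[i]|.
    """
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(exp)), 2):
        w = abs(exp[j] - exp[i])
        c = _sign(exp[j] - exp[i]) * _sign(pred[j] - pred[i])
        num += w * c
        den += w
    return num / den


# Hypothetical binding free energies in kcal/mol (illustrative values only).
experimental = [-9.0, -8.0, -7.0, -6.0]
predicted = [-8.5, -8.8, -6.5, -6.0]

print("Pearson r:       ", pearson_r(experimental, predicted))
print("Predictive index:", predictive_index(experimental, predicted))
print("Percent correct: ", percent_correct(experimental, predicted))
print("RMSE:            ", rmse(experimental, predicted))
```

In this toy set, one of the six ligand pairs is ranked in the wrong order. Because that pair has the smallest experimental difference, it lowers the predictive index less than it lowers the percent correct. This is why the two measures can diverge on real data.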