Generation of Tautomers Using Micro-pKa's.
Mark A Watson, Haoyu S Yu, Art D Bochevarov
Published in: Journal of Chemical Information and Modeling (2019)
Solutions of organic molecules containing one or more heterocycles with conjugated bonds may exist as a mixture of tautomers, but typically only a few of them are significantly populated even though the potential number grows combinatorially with the number of protonation and deprotonation sites. Generating the most stable tautomers from a given input structure is an important and challenging task, and numerous algorithms to tackle it have been proposed in the literature. This work describes a novel approach for tautomer prediction that involves the combined use of molecular mechanics, semiempirical quantum chemistry, and density functional theory. The key idea in our method is to identify the protonation and deprotonation sites using estimated micro-pKa's for every atom in the molecule as well as in its nearest protonated and deprotonated forms. To generate tautomers in a systematic way with minimal bias, we then consider the full set of tautomers that arise from the combinatorial distribution of all such mobile protons among all protonatable sites, with efficient postprocessing to screen away high-energy species. To estimate the micro-pKa's, we present a new method designed for the current task, but we emphasize that any alternative method can be used in conjunction with our basic algorithm. Our approach is therefore grounded in the computational prediction of physical properties in aqueous solution, in contrast to other approaches that may rely on the use of hard-coded rules of proton distribution, previously observed tautomerization patterns from a known chemical space, or human input. We present examples of the application of our algorithm to organic and drug-like molecules, with a focus on novel structures where traditional methods are expected to perform worse.
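The combinatorial step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the protonatable sites have already been chosen (for example from estimated micro-pKa's), that each site holds at most one mobile proton, and that a relative energy is available for each candidate. The site labels, the per-site costs, the additive `toy_energy` function, and the 5.0 energy window are all made-up placeholders standing in for a real energy model.

```python
from itertools import combinations


def enumerate_tautomers(sites, n_protons):
    """List every way to place n_protons on distinct sites (one proton per site)."""
    return [frozenset(choice) for choice in combinations(sites, n_protons)]


def screen_by_energy(candidates, energy_fn, window):
    """Keep candidates within `window` of the lowest energy, sorted by energy."""
    energies = {c: energy_fn(c) for c in candidates}
    e_min = min(energies.values())
    kept = [c for c in candidates if energies[c] - e_min <= window]
    return sorted(kept, key=energies.get)


# Hypothetical input: four protonatable sites sharing two mobile protons.
sites = ["N1", "N3", "O7", "O9"]

# Made-up per-site costs (arbitrary units); a real workflow would use
# computed energies for each full structure instead of an additive model.
site_cost = {"N1": 0.0, "N3": 1.5, "O7": 6.0, "O9": 9.0}


def toy_energy(occupied_sites):
    """Additive placeholder energy for one proton distribution."""
    return sum(site_cost[s] for s in occupied_sites)


candidates = enumerate_tautomers(sites, n_protons=2)
survivors = screen_by_energy(candidates, toy_energy, window=5.0)

for tautomer in survivors:
    print(sorted(tautomer), toy_energy(tautomer))
```

With four sites and two protons the enumeration yields six candidate distributions, and the toy screen keeps the two lowest. The number of candidates grows as a binomial coefficient in the number of sites, which is why a cheap screening pass before any expensive energy evaluation matters in practice.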
Keyphrases
- density functional theory
- molecular dynamics
- machine learning
- aqueous solution
- deep learning
- endothelial cells
- systematic review
- magnetic resonance
- physical activity
- mental health
- induced pluripotent stem cells
- emergency department
- magnetic resonance imaging
- water soluble
- computed tomography
- neural network
- single molecule
- climate change
- human health
- monte carlo