Epik: pKa and Protonation State Prediction through Machine Learning.
Ryne C Johnston, Kun Yao, Zachary Kaplan, Monica Chelliah, Karl Leswing, Sean Seekins, Shawn Watts, David Calkins, Jackson Chief Elk, Steven V Jerome, Matthew P Repasky, John C Shelley
Published in: Journal of Chemical Theory and Computation (2023)
Epik version 7 is a software program that uses machine learning for predicting the pKa values and protonation state distribution of complex, druglike molecules. Using an ensemble of atomic graph convolutional neural networks (GCNNs) trained on over 42,000 pKa values across broad chemical space from both experimental and computed origins, the model predicts pKa values with 0.42 and 0.72 pKa unit median absolute and root mean square errors, respectively, across seven test sets. Epik version 7 also generates protonation states and recovers 95% of the most populated protonation states compared to previous versions. Requiring on average only 47 ms per ligand, Epik version 7 is rapid and accurate enough to evaluate protonation states for crucial molecules and prepare ultra-large libraries of compounds to explore vast regions of chemical space. The simplicity and time required for the training allow for the generation of highly accurate models customized to a program's specific chemistry.
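To make the quantities in the abstract concrete, the following is a minimal Python sketch, not Epik's implementation or API. It assumes a single titratable site, so the protonated fraction follows the Henderson-Hasselbalch relation, and it shows how median absolute error, root mean square error, and a throughput estimate from the reported 47 ms per ligand could be computed. All function names are hypothetical, and the pKa values in the usage lines are made-up illustrations rather than data from the paper.

```python
import math
from statistics import median


def protonated_fraction(pka: float, ph: float = 7.4) -> float:
    """Fraction of a single titratable site that is protonated at a given pH.

    Assumption: one independent site (Henderson-Hasselbalch). Molecules with
    several coupled sites need a full microstate treatment, which this
    sketch does not attempt.
    """
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def median_absolute_error(predicted, reference) -> float:
    """Median of |predicted - reference| over paired pKa values."""
    return median(abs(p - r) for p, r in zip(predicted, reference))


def root_mean_square_error(predicted, reference) -> float:
    """Square root of the mean squared difference over paired pKa values."""
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


def library_cpu_hours(n_ligands: int, ms_per_ligand: float = 47.0) -> float:
    """Serial CPU time in hours, using the abstract's average of 47 ms per ligand."""
    return n_ligands * ms_per_ligand / 1000.0 / 3600.0


if __name__ == "__main__":
    # Illustrative (made-up) values, not taken from the paper.
    predicted = [4.8, 9.1, 7.0]
    reference = [4.5, 9.6, 7.0]
    print(protonated_fraction(9.1))            # amine-like site at pH 7.4
    print(median_absolute_error(predicted, reference))
    print(root_mean_square_error(predicted, reference))
    print(library_cpu_hours(1_000_000_000))    # a billion-compound library
```

Under these assumptions, a site whose pKa equals the pH is half protonated, and a 0.72 unit pKa error near that crossover can noticeably shift the predicted population, which is why both error metrics are worth reporting.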
Keyphrases
- convolutional neural network
- machine learning
- deep learning
- artificial intelligence
- big data
- drug discovery