QupKake: Integrating Machine Learning and Quantum Chemistry for Micro-pKa Predictions.
Omri D Abarbanel, Geoffrey R Hutchison
Published in: Journal of Chemical Theory and Computation (2024)
Accurate prediction of micro-pKa values is crucial for understanding and modulating the acidity and basicity of organic molecules, with applications in drug discovery, materials science, and environmental chemistry. This work introduces QupKake, a novel method that combines graph neural network models with semiempirical quantum mechanical (QM) features to achieve exceptional accuracy and generalization in micro-pKa prediction. QupKake outperforms state-of-the-art models on a variety of benchmark data sets, with root-mean-square errors between 0.5 and 0.8 pKa units on five external test sets. Feature importance analysis reveals the crucial role of QM features in both the reaction site enumeration and micro-pKa prediction models. QupKake represents a significant advancement in micro-pKa prediction, offering a powerful tool for various applications in chemistry and beyond.
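
For readers unfamiliar with the metric, the root-mean-square error quoted in the abstract is the square root of the mean squared difference between predicted and experimental values. The following is a minimal sketch of that calculation, not the authors' evaluation code. The function name `rmse` and all numbers are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math


def rmse(predicted, experimental):
    """Root-mean-square error between two equal-length sequences of values."""
    if len(predicted) != len(experimental):
        raise ValueError("sequences must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    # Square each prediction error, average, then take the square root.
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical values for illustration only; not taken from the paper.
experimental_pka = [4.0, 9.0, 10.0, 5.0]
predicted_pka = [4.5, 8.5, 10.5, 4.5]

print(f"RMSE = {rmse(predicted_pka, experimental_pka):.2f} pKa units")
# prints "RMSE = 0.50 pKa units"
```

Because errors are squared before averaging, a few large misses raise the RMSE more than many small ones, so an RMSE of 0.5 to 0.8 pKa units constrains outliers as well as the typical error.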