Predictive Minisci late stage functionalization with transfer learning.
Emma King-Smith, Felix A Faber, Usa Reilly, Anton V Sinitskiy, Qingyi Yang, Bo Liu, Dennis Hyek, Alpha A Lee
Published in: Nature Communications (2024)
Structural diversification of lead molecules is a key component of drug discovery to explore chemical space. Late-stage functionalizations (LSFs) are versatile methodologies capable of installing functional handles on richly decorated intermediates to deliver numerous diverse products in a single reaction. Predicting the regioselectivity of LSF is still an open challenge in the field. Numerous efforts from chemoinformatics and machine learning (ML) groups have made strides in this area. However, it is arduous to isolate and characterize the multitude of LSF products generated, limiting available data and hindering pure ML approaches. We report the development of an approach that combines a message passing neural network and ¹³C NMR-based transfer learning to predict the atom-wise probabilities of functionalization for Minisci and P450-based functionalizations. We validated our model both retrospectively and with a series of prospective experiments, showing that it accurately predicts the outcomes of Minisci-type and P450 transformations and outperforms the well-established Fukui-based reactivity indices and other machine learning reactivity-based algorithms.
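The abstract does not give architectural details, so the following is only a toy sketch of the general idea it names: a message passing encoder shared between a ¹³C NMR shift head (the pretraining task) and an atom-wise functionalization head (the fine-tuning task). Every name (`encode`, `linear_head`, `W`), the pyridine-like toy graph, and all numeric weights are hypothetical illustrations, not values or code from the paper, and no training is performed here.

```python
import math

def relu(x):
    return x if x > 0.0 else 0.0

def encode(features, adjacency, W, rounds=2):
    """Toy message passing: average each atom with its neighbors, then
    apply a shared linear map W followed by ReLU. Not the paper's model."""
    h = [list(f) for f in features]
    for _ in range(rounds):
        updated = []
        for i, nbrs in enumerate(adjacency):
            group = [h[i]] + [h[j] for j in nbrs]
            mean = [sum(col) / len(group) for col in zip(*group)]
            updated.append([relu(sum(w * m for w, m in zip(row, mean))) for row in W])
        h = updated
    return h

def linear_head(hidden, weights, bias):
    """One scalar output per atom from its hidden vector."""
    return [sum(w * x for w, x in zip(weights, v)) + bias for v in hidden]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical six-membered ring: atom 0 is N, atoms 1..5 are C.
adjacency = [[1, 5], [0, 2], [1, 3], [2, 4], [3, 5], [4, 0]]
features = [[1.0, 0.0]] + [[0.0, 1.0] for _ in range(5)]  # one-hot [is_N, is_C]

# Shared encoder weights (assumed; these would be learned during pretraining).
W = [[1.0, 0.5], [0.5, 1.0]]
hidden = encode(features, adjacency, W)

# Pretraining head: regress a per-atom 13C shift (weights are placeholders).
shifts = linear_head(hidden, weights=[10.0, 5.0], bias=100.0)

# Fine-tuning head: reuse the same hidden vectors, swap in a sigmoid readout
# that gives an independent functionalization probability for each atom.
probs = [sigmoid(z) for z in linear_head(hidden, weights=[2.0, -1.0], bias=0.0)]
```

The point of the sketch is the data flow: both heads read the same per-atom hidden vectors, which is what would let a plentiful NMR dataset shape the encoder before the scarce regioselectivity data is used.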
Keyphrases
- machine learning
- drug discovery
- neural network
- big data
- electron transfer
- artificial intelligence
- magnetic resonance
- deep learning