EnzyKR: a chirality-aware deep learning model for predicting the outcomes of the hydrolase-catalyzed kinetic resolution.
Xinchun Ran, Yaoyukun Jiang, Qianzhen Shao, Zhongyue J Yang
Published in: Chemical Science (2023)
Hydrolase-catalyzed kinetic resolution is a well-established biocatalytic process. However, computational tools that predict favorable enzyme scaffolds for separating a racemic substrate mixture are underdeveloped. To address this challenge, we trained a deep learning framework, EnzyKR, to automate the selection of hydrolases for stereoselective biocatalysis. EnzyKR adopts a classifier-regressor architecture that first identifies the reactive binding conformer of a substrate-hydrolase complex and then predicts its activation free energy. A structure-based encoding strategy was used to depict the chiral interactions between hydrolases and enantiomers. Unlike existing models trained on protein sequences and substrate SMILES strings, EnzyKR was trained on 204 substrate-hydrolase complexes constructed by docking. On a held-out dataset of 20 complexes, EnzyKR predicted activation free energies with a Pearson correlation coefficient (R) of 0.72, a Spearman rank correlation coefficient (Spearman R) of 0.72, and a mean absolute error (MAE) of 1.54 kcal mol⁻¹. Furthermore, EnzyKR was tested on the task of predicting enantiomeric excess ratios for 28 hydrolytic kinetic resolution reactions catalyzed by fluoroacetate dehalogenase RPA1163, halohydrin dehalogenase HheC, A. mediolanus epoxide hydrolase, and P. fluorescens esterase. Its performance was compared against that of a recently developed kinetic predictor, DLKcat. EnzyKR correctly predicts the favored enantiomer and outperforms DLKcat in 18 out of 28 reactions (64% of the test cases). These results demonstrate that EnzyKR is a new approach for predicting enantiomeric outcomes in hydrolase-catalyzed kinetic resolution reactions.
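The evaluation metrics named in the abstract, and the link between activation free energies and enantiomeric preference, can be illustrated with a minimal Python sketch. It is not the EnzyKR code: the function names are hypothetical, and the conversion from a pair of activation free energies to a rate ratio assumes the standard transition-state-theory relation E = exp(ΔΔG‡/RT) at 298.15 K, which the abstract does not state. Any numbers passed to these functions would be illustrative inputs, not data from the paper.

```python
import math

# Gas constant in kcal mol^-1 K^-1 (standard physical constant).
R_KCAL = 1.987204e-3


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def mae(x, y):
    """Mean absolute error, in the units of the inputs (here kcal/mol)."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def favored_enantiomer(dg_r, dg_s):
    """The enantiomer with the lower activation free energy reacts faster."""
    return "R" if dg_r < dg_s else "S"


def rate_ratio(dg_fast, dg_slow, temp_k=298.15):
    """Assumed TST relation: k_fast / k_slow = exp((dG_slow - dG_fast) / RT)."""
    return math.exp((dg_slow - dg_fast) / (R_KCAL * temp_k))


# Share of reactions reported in the abstract: 18 of 28.
share_percent = round(18 / 28 * 100)
```

Under this assumed relation, a 1 kcal mol⁻¹ gap between the two enantiomers' barriers corresponds to a rate ratio of roughly 5 at room temperature, which gives a sense of how the reported MAE of 1.54 kcal mol⁻¹ compares with the energy differences that decide which enantiomer is favored.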
Keyphrases
- deep learning
- room temperature
- single molecule
- resistance training
- capillary electrophoresis
- amino acid
- artificial intelligence
- gene expression
- protein protein
- wastewater treatment
- machine learning
- type diabetes
- magnetic resonance imaging
- convolutional neural network
- structural basis
- computed tomography
- magnetic resonance
- skeletal muscle
- body composition
- glycemic control
- dna methylation