gr Predictor: A Deep Learning Model for Predicting the Hydration Structures around Proteins.
Kosuke Kawama, Yusaku Fukushima, Mitsunori Ikeguchi, Masateru Ohta, Takashi Yoshidome
Published in: Journal of Chemical Information and Modeling (2022)
Among the factors affecting biological processes such as protein folding and ligand binding, hydration, which is represented by a three-dimensional water site distribution function around the protein, is crucial. The typical methods for computing the distribution functions, including molecular dynamics simulations and the three-dimensional reference interaction site model (3D-RISM) theory, require long computation times, ranging from hours to tens of hours. Here, we propose a deep learning (DL) model that rapidly estimates, from the protein 3D structure, the distribution functions around proteins that would be obtained using the 3D-RISM theory. The distribution functions predicted using our DL model are in good agreement with those obtained using the 3D-RISM theory. In particular, the coefficient of determination between the distribution function obtained by the DL model and that obtained using the 3D-RISM theory is approximately 0.98. Furthermore, using a graphics processing unit, the prediction by the DL model is completed in less than 1 min, more than 2 orders of magnitude faster than the calculation time of the 3D-RISM theory. The positions of water molecules around the protein were estimated from the distribution function obtained by our DL model, and they were in good agreement with the positions estimated using the 3D-RISM theory and with those of crystallographic waters. Therefore, our DL model provides a practical and efficient way to calculate the three-dimensional water site distribution functions and to estimate the positions of water molecules around the protein. The program, called "gr Predictor", is available under the GNU General Public License from https://github.com/YoshidomeGroup-Hydration/gr-predictor.
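The agreement metric quoted in the abstract, a coefficient of determination of approximately 0.98 between the DL and 3D-RISM distribution functions, can be illustrated with a minimal sketch. It assumes both distribution functions are sampled on the same 3D grid and uses the common definition R^2 = 1 - SS_res / SS_tot with the 3D-RISM values as the reference; the abstract does not state which definition the authors use, so this is an assumption. The function name `coefficient_of_determination` and the synthetic grids are hypothetical and are not part of the gr Predictor program.

```python
import numpy as np


def coefficient_of_determination(g_ref, g_pred):
    """Return R^2 = 1 - SS_res / SS_tot between two grids of equal shape.

    g_ref  : reference values (assumed here to play the role of 3D-RISM output)
    g_pred : predicted values (assumed here to play the role of DL output)
    """
    g_ref = np.asarray(g_ref, dtype=float)
    g_pred = np.asarray(g_pred, dtype=float)
    if g_ref.shape != g_pred.shape:
        raise ValueError("both grids must have the same shape")

    # Residual sum of squares between reference and prediction.
    ss_res = np.sum((g_ref - g_pred) ** 2)
    # Total sum of squares of the reference around its mean.
    ss_tot = np.sum((g_ref - g_ref.mean()) ** 2)
    if ss_tot == 0.0:
        raise ValueError("reference grid is constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in data only: a random non-negative "reference" grid and a
# noisy copy of it. These are NOT real 3D-RISM or DL results.
rng = np.random.default_rng(0)
g_reference = rng.gamma(shape=2.0, scale=0.5, size=(16, 16, 16))
g_predicted = g_reference + rng.normal(scale=0.05, size=g_reference.shape)

r2 = coefficient_of_determination(g_reference, g_predicted)
print(f"R^2 on synthetic grids: {r2:.4f}")
```

With real data, the two arrays would be the grid values of the distribution function from each method, read with whatever file format the respective programs write.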