Coarse-Graining with Equivariant Neural Networks: A Path Toward Accurate and Data-Efficient Models.
Timothy D Loose, Patrick G Sahrmann, Thomas S Qu, Gregory A Voth
Published in: The Journal of Physical Chemistry B (2023)
Machine learning has recently entered the mainstream of coarse-grained (CG) molecular modeling and simulation. While a variety of methods for incorporating deep learning into these models exist, many of them involve training neural networks to act directly as the CG force field. This has several benefits, the most significant of which is accuracy. Neural networks can inherently incorporate multibody effects during the calculation of CG forces, and a well-trained neural network force field outperforms pairwise basis sets generated from essentially any methodology. However, this comes at a significant cost. First, these models are typically slower than pairwise force fields, even when accounting for specialized hardware that accelerates the training and integration of such networks. The second cost, and the focus of this paper, is the need for a considerable amount of data to train such force fields. It is common to use tens of microseconds of molecular dynamics data to train a single CG model, which approaches the point of eliminating the CG model's usefulness in the first place. As we investigate in this work, this "data-hunger" trap of neural networks for predicting molecular energies and forces can be remediated in part by incorporating equivariant convolutional operations. We demonstrate that, for CG water, networks that incorporate equivariant convolutional operations can produce functional models using data sets as small as a single frame of reference data, while networks without these operations cannot.
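To make the notion of equivariance concrete, the following is a minimal sketch in Python with NumPy. It is not the architecture used in the paper. It checks the defining property of a rotation-equivariant force model, f(Rx) = R f(x), for a toy force function built only from relative position vectors, and contrasts it with a toy coordinate-wise model that lacks this property. The function names (toy_forces, naive_forces, random_rotation), the exponential pair weighting, and the five-site random configuration are all illustrative assumptions.

```python
import numpy as np

def toy_forces(positions):
    """Hypothetical equivariant model: sum of pair vectors r_i - r_j,
    each scaled by a function of the (rotation-invariant) pair distance."""
    diff = positions[:, None, :] - positions[None, :, :]  # (N, N, 3)
    dist = np.linalg.norm(diff, axis=-1)                   # (N, N)
    np.fill_diagonal(dist, np.inf)                         # no self-interaction
    scale = np.exp(-dist) / dist                           # invariant scalar weight
    return (scale[:, :, None] * diff).sum(axis=1)          # (N, 3)

def naive_forces(positions, weights):
    """Hypothetical non-equivariant model: acts on raw Cartesian coordinates,
    so it would have to learn rotational symmetry from data."""
    return np.tanh(positions @ weights)

def random_rotation(rng):
    """Draw a proper rotation matrix (det = +1) via QR decomposition."""
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

rng = np.random.default_rng(0)
pos = rng.normal(size=(5, 3))    # five CG sites, arbitrary units
rot = random_rotation(rng)
weights = rng.normal(size=(3, 3))

# Equivariance test: rotating the input must equal rotating the output.
toy_is_equivariant = np.allclose(toy_forces(pos @ rot.T),
                                 toy_forces(pos) @ rot.T)
naive_is_equivariant = np.allclose(naive_forces(pos @ rot.T, weights),
                                   naive_forces(pos, weights) @ rot.T)
print("toy model equivariant:  ", toy_is_equivariant)
print("naive model equivariant:", naive_is_equivariant)
```

A model with the symmetry built in does not need rotated copies of the same configuration to learn it, which is one plausible reading of why such networks can be more data-efficient.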