Predicting Experimental Heats of Formation via Deep Learning with Limited Experimental Data.
GuanYa Yang, Wai Yuet Chiu, Jiang Wu, Yi Zhou, ShuGuang Chen, WeiJun Zhou, Jiaqi Fan, GuanHua Chen
Published in: The Journal of Physical Chemistry A (2022)
When predicting experimental values of molecular properties with deep learning, the key problem is the lack of sufficient experimental data for training. We propose a method that consists of pretraining a graph neural network to reproduce first-principles quantum mechanical results, followed by fine-tuning a fully connected neural network against experimental results. The combined pretraining and fine-tuning model is expected to yield molecular properties close to experimental accuracy. This is possible because first-principles quantum mechanical methods are often qualitatively correct or semiquantitatively accurate; thus, calibrating the calculated results against high-precision but limited experimental data can greatly improve accuracy. Moreover, the method is highly efficient, as the first-principles quantum mechanical calculation is bypassed. To demonstrate this, we apply the combined model to determine the experimental heats of formation of organic molecules made of H, C, O, N, or F atoms (up to 30 atoms), using a mere 405 experimental data points. The overall mean absolute error is 1.8 kcal/mol for these molecules.
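The two-stage idea described in the abstract (pretrain on abundant calculated values, then calibrate against a small experimental set) can be illustrated with a toy sketch. This is not the authors' model: linear least-squares fits stand in for both the graph neural network and the fully connected network, the data are synthetic, and every name here (`calculated_value`, `experimental_value`, the scale 1.05, the offset 2.0, the noise level) is a hypothetical assumption chosen only for illustration. The one number taken from the abstract is the size of the experimental set, 405.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features = 8  # assumed size of a stand-in molecular descriptor vector
w_true = rng.normal(size=n_features)

def calculated_value(X):
    """Synthetic stand-in for a first-principles result (cheap to label here)."""
    return X @ w_true

def experimental_value(X):
    """Synthetic 'experiment': the calculated value with a systematic
    scale/offset error plus small noise (all values are assumptions)."""
    noise = rng.normal(scale=0.1, size=X.shape[0])
    return 1.05 * calculated_value(X) + 2.0 + noise

# Stage 1 (pretraining): fit a surrogate on many calculated labels.
# A linear fit replaces the graph neural network of the paper.
X_qm = rng.normal(size=(5000, n_features))
w_pre = np.linalg.lstsq(X_qm, calculated_value(X_qm), rcond=None)[0]

def pretrained(X):
    return X @ w_pre

# Stage 2 (fine-tuning): calibrate the frozen surrogate against a small
# experimental set. A scale-and-offset fit replaces the fully connected network.
X_exp = rng.normal(size=(405, n_features))
y_exp = experimental_value(X_exp)
A = np.column_stack([pretrained(X_exp), np.ones(len(X_exp))])
scale, offset = np.linalg.lstsq(A, y_exp, rcond=None)[0]

# Evaluate on held-out synthetic molecules: no calculated labels are needed
# at prediction time, only the surrogate and the calibration.
X_test = rng.normal(size=(1000, n_features))
y_test = experimental_value(X_test)
mae_raw = np.mean(np.abs(pretrained(X_test) - y_test))
mae_calibrated = np.mean(np.abs(scale * pretrained(X_test) + offset - y_test))

print(f"MAE without calibration: {mae_raw:.3f}")
print(f"MAE with calibration:    {mae_calibrated:.3f}")
```

In this toy setting the calibration step removes the systematic error that the surrogate inherits from the calculated labels, which mirrors the reasoning in the abstract; the magnitudes printed say nothing about the 1.8 kcal/mol reported for the real model.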