Density Prediction Models for Energetic Compounds Merely Using Molecular Topology.

Chunming Yang, Jie Chen, Runwen Wang, Miao Zhang, Chaoyang Zhang, Jian Liu
Published in: Journal of chemical information and modeling (2021)
Newly developed high-throughput methods for property prediction make materials design faster and more efficient. Density is an important physical property of energetic compounds because it is used to assess detonation velocity and detonation pressure, but recent density prediction models remain slow because calculating molecular descriptors is time-consuming. To improve the screening efficiency of potential energetic compounds, new density prediction methods with higher accuracy and lower time cost are urgently needed, and a possible solution is to establish direct mappings between molecular structure and density. We propose three machine learning (ML) models, support vector machine (SVM), random forest (RF), and graph neural network (GNN), using molecular topology as the only input. The widely applied quantitative structure-property relationship based on density functional theory (DFT-QSPR) is adopted as the benchmark to evaluate the accuracy of the models. All four models are trained and tested on the same data set, which contains over 2000 reported nitro compounds retrieved from the Cambridge Structural Database. The proportion of compounds with a prediction error of less than 5% is evaluated on the independent test set, and the values for the SVM, RF, DFT-QSPR, and GNN models are 48, 63, 85, and 88%, respectively. The results show that, for the SVM and RF models, fingerprint bit vectors alone are not sufficient to obtain good QSPRs. A mapping between molecular structure and density can be well established by using a GNN and molecular topology, and its accuracy is slightly better than that of the time-consuming DFT-QSPR method. The GNN-based model has higher accuracy and lower computational cost than the widely accepted DFT-QSPR model, so it is more suitable for high-throughput screening of energetic compounds.
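The accuracy figure reported above (the share of test compounds predicted within 5% error) can be illustrated with a minimal Python sketch. The abstract does not define the error precisely, so this sketch assumes it is the absolute relative deviation from the measured density. The function name `fraction_within_tolerance` and the density values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def fraction_within_tolerance(predicted, measured, tolerance=0.05):
    """Return the fraction of compounds whose relative density error is below tolerance.

    Assumption: error is |predicted - measured| / measured, which the
    abstract does not state explicitly.
    """
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    if not measured:
        raise ValueError("at least one compound is required")
    hits = sum(
        1
        for p, m in zip(predicted, measured)
        if abs(p - m) / m < tolerance
    )
    return hits / len(measured)


# Hypothetical densities in g/cm^3, invented purely for illustration.
measured = [1.80, 1.65, 1.90, 1.72]
predicted = [1.78, 1.80, 1.91, 1.60]

score = fraction_within_tolerance(predicted, measured)
print(f"{score:.0%} of compounds predicted within 5% error")
```

Under this definition, a model that scores 88% on the test set places roughly nine out of ten compounds within 5% of their measured density.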