Invariant Molecular Representations for Heterogeneous Catalysis.
Jawad Chowdhury, Charles H Fricke, Olajide H Bamidele, Mubarak Bello, Wenqiang Yang, Andreas Heyden, Gabriel Terejanu
Published in: Journal of Chemical Information and Modeling (2024)
Catalyst screening is a critical step in the discovery and development of heterogeneous catalysts, which are vital for a wide range of chemical processes. In recent years, computational catalyst screening, primarily through density functional theory (DFT), has gained significant attention as a method for identifying promising catalysts. However, computing adsorption energies for all likely chemical intermediates in complex surface chemistries is costly, both because the calculations themselves are expensive and because of the intrinsic idiosyncrasies of the methods or data sets used. This study introduces a novel machine learning (ML) method that learns adsorption energies from multiple DFT functionals by using invariant molecular representations (IMRs). We first extract molecular fingerprints for the reaction intermediates and then use a Siamese-neural-network-based training strategy to learn IMRs across all available functionals. Our Siamese-network-based representations predict adsorption energies more accurately than the other molecular representations considered. Notably, against mean absolute adsorption energies of 0.43 eV (PBE-D3), 0.46 eV (BEEF-vdW), 0.81 eV (RPBE), and 0.37 eV (SCAN+rVV10), our IMR method achieves the lowest mean absolute errors (MAEs) of 0.18, 0.10, 0.16, and 0.18 eV, respectively. These empirical findings demonstrate the efficacy, robustness, and dependability of the proposed ML paradigm in predicting adsorption energies, specifically for propane dehydrogenation on a platinum catalyst surface.
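The abstract does not spell out the training objective, so the following is only a minimal sketch of the general Siamese idea, not the authors' implementation. It assumes that a pair consists of fingerprints of intermediates computed under two different functionals, that a single shared encoder maps both to embeddings, and that a classic contrastive loss pulls together pairs describing the same intermediate. The names `embed`, `contrastive_loss`, `weights`, and the toy fingerprints are all hypothetical.

```python
import math


def embed(fingerprint, weights):
    """Shared encoder (one linear layer + tanh). Both branches of the
    Siamese pair call this with the SAME weights (assumed architecture)."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, fingerprint)))
        for row in weights
    ]


def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def contrastive_loss(z1, z2, same_intermediate, margin=1.0):
    """Classic contrastive loss (an assumed choice of objective).

    Matching pairs are penalised by their squared distance; non-matching
    pairs are penalised only if they are closer than `margin`.
    """
    d2 = squared_distance(z1, z2)
    if same_intermediate:
        return d2
    return max(0.0, margin - math.sqrt(d2)) ** 2


# Hypothetical toy data: one intermediate described under two functionals.
weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.4]]
fp_functional_a = [1.0, 0.0, 2.0]
fp_functional_b = [1.1, 0.1, 1.9]

z_a = embed(fp_functional_a, weights)
z_b = embed(fp_functional_b, weights)
loss = contrastive_loss(z_a, z_b, same_intermediate=True)
print(loss)
```

Minimising such a loss over many pairs would push the encoder toward embeddings that depend on the intermediate rather than on the functional, which is the sense of "invariant" suggested by the abstract. A downstream regressor for adsorption energy would then take these embeddings as input.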
Keyphrases
- density functional theory
- working memory
- molecular dynamics
- highly efficient
- machine learning
- aqueous solution
- neural network
- metal organic framework
- room temperature
- ionic liquid
- single molecule
- computed tomography
- small molecule
- carbon dioxide
- big data
- emergency department
- artificial intelligence
- high throughput
- oxidative stress
- deep learning
- electronic health record
- single cell