MonteCat: A Basin-Hopping-Inspired Catalyst Descriptor Search Algorithm for Machine Learning Models.

Fernando Garcia-Escobar, Toshiaki Taniike, Keisuke Takahashi
Published in: Journal of Chemical Information and Modeling (2024)
Proposing relevant catalyst descriptors that relate a catalyst's composition to its actual performance is an active research area in catalyst informatics, as it is a necessary step toward improving our understanding of the target reactions. Herein, a small descriptor-engineered data set containing 3289 descriptor variables and the performance of 200 catalysts for the oxidative coupling of methane (OCM) is analyzed, and a descriptor search algorithm based on the workflow of the Basin-hopping optimization methodology is proposed to select the descriptors that better fit a predictive model. The algorithm, which can be considered a wrapper method, successively generates random modifications to the descriptor subset used in a regression model and adopts them depending on their effect on the model's score. The algorithm is tested on linear and Support Vector Regression models, yielding average cross-validation r² scores of 0.8268 and 0.6875, respectively.
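
The search loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `descriptor_search`, the add/remove/swap moves, the Metropolis-style acceptance rule (borrowed from standard Basin-hopping), the `temperature` value, and the toy scoring function are all assumptions made for illustration. In a real setting, `score_fn` would presumably return the cross-validated r² of a regression model fitted on the selected descriptor columns.

```python
import math
import random


def descriptor_search(n_descriptors, score_fn, n_steps=2000, temperature=0.02, seed=0):
    """Hypothetical wrapper-style search over subsets of descriptor indices."""
    rng = random.Random(seed)
    # Start from a single randomly chosen descriptor.
    current = {rng.randrange(n_descriptors)}
    current_score = score_fn(current)
    best, best_score = set(current), current_score

    for _ in range(n_steps):
        # Randomly modify the subset: add, remove, or swap one descriptor.
        # A swap removes one descriptor and then adds one; removal is skipped
        # when only one descriptor is left, so the subset is never empty.
        candidate = set(current)
        move = rng.choice(("add", "remove", "swap"))
        if move in ("remove", "swap") and len(candidate) > 1:
            candidate.discard(rng.choice(sorted(candidate)))
        if move in ("add", "swap"):
            candidate.add(rng.randrange(n_descriptors))

        # Assumed acceptance rule (Metropolis criterion): always keep a
        # modification that does not lower the score, and occasionally keep
        # a worse one so the search can leave a poor region.
        candidate_score = score_fn(candidate)
        delta = candidate_score - current_score
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = set(current), current_score

    return best, best_score


# Toy stand-in for a model score: rewards three "relevant" descriptors and
# penalizes every irrelevant one. Purely illustrative, not real catalyst data.
RELEVANT = {2, 5, 7}


def toy_score(subset):
    hits = len(subset & RELEVANT)
    extras = len(subset - RELEVANT)
    return hits / len(RELEVANT) - 0.05 * extras


best, best_score = descriptor_search(n_descriptors=20, score_fn=toy_score)
print(sorted(best), round(best_score, 4))
```

Because the model is only reached through `score_fn`, the same loop could wrap a linear model or a Support Vector Regression model without changes, which is what makes the approach a wrapper method.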