Data-Centric Heterogeneous Catalysis: Identifying Rules and Materials Genes of Alkane Selective Oxidation.
Lucas Foppa, Frederik Rüther, Michael Geske, Gregor Koch, Frank Girgsdies, Pierre Kube, Spencer J Carey, Michael Hävecker, Olaf Timpe, Andrey V Tarasov, Matthias Scheffler, Frank Rosowski, Robert Schlögl, Annette Trunschke
Published in: Journal of the American Chemical Society (2023)
Artificial intelligence (AI) can accelerate catalyst design by identifying key physicochemical descriptive parameters correlated with the underlying processes triggering, favoring, or hindering the performance. In analogy to genes in biology, these parameters might be called "materials genes" of heterogeneous catalysis. However, widely used AI methods require big data, and only the smallest part of the available data meets the quality requirement for data-efficient AI. Here, we use rigorous experimental procedures, designed to consistently take into account the kinetics of the catalyst active states formation, to measure 55 physicochemical parameters as well as the reactivity of 12 catalysts toward ethane, propane, and n-butane oxidation reactions. These materials are based on vanadium or manganese redox-active elements and present diverse phase compositions, crystallinities, and catalytic behaviors. By applying the sure-independence-screening-and-sparsifying-operator symbolic-regression approach to the consistent data set, we identify nonlinear property-function relationships depending on several key parameters and reflecting the intricate interplay of processes that govern the formation of olefins and oxygenates: local transport, site isolation, surface redox activity, adsorption, and the material dynamical restructuring under reaction conditions. These processes are captured by parameters derived from N₂ adsorption, X-ray photoelectron spectroscopy (XPS), and near-ambient-pressure in situ XPS. The data-centric approach indicates the most relevant characterization techniques to be used for catalyst design and provides "rules" on how the catalyst properties may be tuned in order to achieve the desired performance.
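The sure-independence-screening-and-sparsifying-operator idea named in the abstract can be illustrated with a much-simplified, single-pass sketch. This is not the authors' implementation or the published SISSO code. The function names (`build_candidates`, `sis`, `sparsify`), the small operator set, and the synthetic 12-sample data set are all assumptions made for illustration, and the primary features are assumed to be strictly positive so that ratios are well defined.

```python
import itertools
import numpy as np

def build_candidates(X, names):
    """Expand primary features with a few nonlinear operators (toy feature space).

    Assumes all entries of X are nonzero, so the ratios are finite.
    """
    feats, labels = [], []
    p = X.shape[1]
    for i in range(p):
        feats.append(X[:, i]); labels.append(names[i])
        feats.append(X[:, i] ** 2); labels.append(f"({names[i]})^2")
    for i, j in itertools.combinations(range(p), 2):
        feats.append(X[:, i] * X[:, j]); labels.append(f"{names[i]}*{names[j]}")
        feats.append(X[:, i] / X[:, j]); labels.append(f"{names[i]}/{names[j]}")
    return np.column_stack(feats), labels

def sis(F, y, k):
    """Screening step: indices of the k candidates most correlated with y."""
    Fc = F - F.mean(axis=0)
    yc = y - y.mean()
    denom = np.linalg.norm(Fc, axis=0) * np.linalg.norm(yc)
    # Constant columns get zero correlation instead of a division by zero.
    corr = np.abs(Fc.T @ yc) / np.where(denom > 0, denom, np.inf)
    return np.argsort(corr)[::-1][:k]

def sparsify(F, y, idx, dim):
    """Sparsifying step: exhaustive search over subsets of size dim,
    each fitted by least squares with an intercept; lowest RMSE wins."""
    best = (np.inf, None, None)
    for combo in itertools.combinations(idx, dim):
        A = np.column_stack([F[:, list(combo)], np.ones(len(y))])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        rmse = np.sqrt(np.mean((A @ coef - y) ** 2))
        if rmse < best[0]:
            best = (rmse, combo, coef)
    return best

# Synthetic demonstration: 12 samples, 3 hypothetical parameters a, b, c.
rng = np.random.default_rng(0)
X = rng.uniform(1.0, 2.0, size=(12, 3))
names = ["a", "b", "c"]
y = 2.0 * X[:, 0] * X[:, 1] + 1.0  # hidden nonlinear "rule" to recover

F, labels = build_candidates(X, names)
top = sis(F, y, k=5)
rmse, combo, coef = sparsify(F, y, top, dim=1)
print([labels[i] for i in combo], coef, rmse)
```

The full method, as generally described, iterates the screening on the residuals of lower-dimensional models and builds a far larger feature space; the sketch only shows how screening followed by an exhaustive sparse fit can recover a nonlinear expression from a small data set.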
Keyphrases
- big data
- artificial intelligence
- machine learning
- deep learning
- electronic health record
- visible light
- highly efficient
- ionic liquid
- room temperature
- genome wide
- high resolution
- magnetic resonance imaging
- air pollution
- electron transfer
- quality improvement
- gene expression
- computed tomography
- data analysis
- genome wide identification