Leveraging Limited Experimental Data with Machine Learning: Differentiating a Methyl from an Ethyl Group in the Corey-Bakshi-Shibata Reduction.

Oliver Pereira, Marcel Ruth, Dennis Gerbig, Raffael Christoph Wende, Peter Richard Schreiner
Published in: Journal of the American Chemical Society (2024)
We present a case study on how to improve an existing metal-free catalyst for a particularly difficult reaction, namely, the Corey-Bakshi-Shibata (CBS) reduction of butanone, which constitutes the classic and prototypical challenge of differentiating a methyl from an ethyl group. As there are no known strategies for addressing this challenge, we leveraged the power of machine learning by constructing a small, albeit high-quality, data set of about 100 reactions (run in triplicate), a size that is realistic for a typical laboratory. We used this data set to train a model in combination with a key-intermediate graph (of substrate and catalyst) to predict the differences in Gibbs activation energies ΔΔG‡ of the enantiomeric reaction paths. With the help of this model, we were able to select and subsequently screen a small selection of catalysts and increase the selectivity for the CBS reduction of butanone to 80% enantiomeric excess (ee), the highest value achieved to date for this substrate with a metal-free catalyst, thereby also exceeding the best available enzymatic systems (64% ee) and the selectivity with Corey's original catalyst (60% ee). This translates into a >50% improvement in relative ΔG‡ from 0.9 to 1.4 kcal mol⁻¹. We underscore the transformative potential of machine learning in accelerating catalyst design because we rely on a manageably small data set and a key-intermediate graph representing a combination of catalyst and substrate graphs in lieu of a transition-state model. Our results highlight the synergy of synthetic chemistry and data-centric approaches and provide a blueprint for future catalyst optimization.
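The link between an enantiomeric excess and a difference in Gibbs activation energies can be illustrated with the standard relation for a kinetically controlled reaction, ΔΔG‡ = RT ln[(1 + ee)/(1 − ee)]. The sketch below is not taken from the paper: the function names are hypothetical, and the default temperature of 298.15 K is an assumption, since the abstract does not state the temperature at which its values were computed.

```python
import math

# Gas constant in kcal mol^-1 K^-1.
R_KCAL = 1.987204e-3


def ddg_from_ee(ee, temperature_k=298.15):
    """Return ddG (kcal/mol) for a fractional ee (0 <= ee < 1).

    Assumes kinetic control, so the enantiomer ratio equals the ratio
    of the two rate constants. The temperature is an assumed parameter.
    """
    if not 0.0 <= ee < 1.0:
        raise ValueError("ee must be a fraction in [0, 1)")
    ratio = (1.0 + ee) / (1.0 - ee)  # major : minor enantiomer
    return R_KCAL * temperature_k * math.log(ratio)


def ee_from_ddg(ddg, temperature_k=298.15):
    """Inverse relation: fractional ee from ddG (kcal/mol)."""
    return math.tanh(ddg / (2.0 * R_KCAL * temperature_k))


if __name__ == "__main__":
    # ee values quoted in the abstract: Corey's catalyst, enzymes, new catalyst.
    for ee in (0.60, 0.64, 0.80):
        print(f"ee = {ee:.0%}: ddG = {ddg_from_ee(ee):.2f} kcal/mol")
```

Because ΔΔG‡ scales linearly with temperature in this relation, the absolute numbers shift with the assumed temperature, while the ratio between two ΔΔG‡ values at the same temperature does not.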