Digitizing chemical discovery with a Bayesian explorer for interpreting reactivity data.

S Hessam M Mehr, Dario Caramelli, Leroy Cronin
Published in: Proceedings of the National Academy of Sciences of the United States of America (2023)
Interpreting the outcome of chemistry experiments consistently is slow and frequently introduces unwanted hidden bias. This difficulty limits the scale of collectable data and often leads to exclusion of negative results, which severely limits progress in the field. What is needed is a way to standardize the discovery process and accelerate the interpretation of high-dimensional data aided by the expert chemist's intuition. We demonstrate a digital Oracle that interprets chemical reactivity using probability. By carrying out >500 reactions covering a large space and retaining both the positive and negative results, the Oracle was able to rediscover eight historically important reactions including the aldol condensation, Buchwald-Hartwig amination, Heck, Mannich, Sonogashira, Suzuki, Wittig, and Wittig-Horner reactions. This paradigm for decoding reactivity validates and formalizes the expert chemist's experience and intuition, providing a quantitative criterion of discovery scalable to all available experimental data.
Keyphrases
  • electronic health record
  • small molecule
  • big data
  • high throughput
  • clinical practice
  • machine learning
  • data analysis
  • artificial intelligence