Enabling late-stage drug diversification by high-throughput experimentation with geometric deep learning.

David F Nippa, Kenneth Atz, Remo Hohler, Alex T Müller, Andreas Marx, Christian Bartelmus, Georg Wuitschik, Irene Marzuoli, Vera Jost, Jens Wolfard, Martin Binder, Antonia F Stepan, David B Konrad, Uwe Grether, Rainer E Martin, Gisbert Schneider

Published in: Nature Chemistry (2023)

Late-stage functionalization is an economical approach to optimize the properties of drug candidates. However, the chemical complexity of drug molecules often makes late-stage diversification challenging. To address this problem, a late-stage functionalization platform based on geometric deep learning and high-throughput reaction screening was developed. Considering borylation as a critical step in late-stage functionalization, the computational model predicted reaction yields for diverse reaction conditions with a mean absolute error margin of 4-5%, while the reactivity of novel reactions with known and unknown substrates was classified with a balanced accuracy of 92% and 67%, respectively. The regioselectivity of the major products was accurately captured with a classifier F-score of 67%. When applied to 23 diverse commercial drug molecules, the platform successfully identified numerous opportunities for structural diversification. The influence of steric and electronic information on model performance was quantified, and a comprehensive, simple, user-friendly reaction format was introduced that proved to be a key enabler for seamlessly integrating deep learning and high-throughput experimentation for late-stage functionalization.

Keyphrases