Impact of Applicability Domains to Generative Artificial Intelligence.
Maxime Langevin, Christoph Grebner, Stefan Güssregen, Susanne Sauer, Yi Li, Hans Matter, Marc Bianciotto
Published in: ACS Omega (2023)
Molecular generative artificial intelligence is drawing significant attention in the drug design community, with several experimentally validated proofs of concept already published. Nevertheless, generative models are known to sometimes produce unrealistic, unstable, unsynthesizable, or uninteresting structures. This calls for methods that constrain those algorithms to generate structures in drug-like portions of chemical space. While the concept of applicability domains for predictive models is well studied, its counterpart for generative models is not yet well defined. In this work, we empirically examine various possibilities and propose applicability domains suited for generative models. Using both public and internal data sets, we apply generative methods to produce novel structures that are predicted to be active by a corresponding quantitative structure-activity relationship (QSAR) model, while constraining the generative model to stay within a given applicability domain. Our work looks at several applicability domain definitions that combine various criteria, such as structural similarity to the training set, similarity of physicochemical properties, unwanted substructures, and the quantitative estimate of drug-likeness (QED). We assess the generated structures from both qualitative and quantitative points of view and find that the applicability domain definition has a strong influence on the drug-likeness of generated molecules. An extensive analysis of our results allows us to identify the applicability domain definitions that are best suited for generating drug-like molecules with generative models. We anticipate that this work will help foster the adoption of generative models in an industrial context.
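To make the idea of a combined applicability domain concrete, here is a minimal sketch in plain Python of how the four criteria named in the abstract (structural similarity to the training set, physicochemical property similarity, unwanted substructures, and drug-likeness) could be merged into one yes/no check. It is an illustration under stated assumptions, not the authors' implementation: the `Molecule` container, the function names, the thresholds `min_similarity=0.4` and `min_qed=0.5`, and the rule of zeroing the score outside the domain are all hypothetical. Fingerprints, properties, substructure alerts, and drug-likeness scores are assumed to have been computed upstream by a cheminformatics toolkit.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Sequence, Tuple


@dataclass
class Molecule:
    """Hypothetical container for descriptors computed upstream."""
    fingerprint: FrozenSet[int]      # on-bits of a structural fingerprint
    properties: Dict[str, float]     # e.g. molecular weight, logP
    has_unwanted_substructure: bool  # outcome of a substructure alert filter
    qed: float                       # drug-likeness score in [0, 1]


def tanimoto(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Tanimoto similarity between two sets of fingerprint on-bits."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def property_ranges(training: Sequence[Molecule]) -> Dict[str, Tuple[float, float]]:
    """Min/max of each property observed in the training set."""
    names = training[0].properties.keys()
    return {
        n: (min(m.properties[n] for m in training),
            max(m.properties[n] for m in training))
        for n in names
    }


def in_applicability_domain(candidate: Molecule,
                            training: Sequence[Molecule],
                            min_similarity: float = 0.4,  # assumed threshold
                            min_qed: float = 0.5) -> bool:  # assumed threshold
    """Return True only if the candidate passes all four criteria."""
    # 1. Structural similarity to the nearest training molecule.
    nearest = max(tanimoto(candidate.fingerprint, m.fingerprint) for m in training)
    if nearest < min_similarity:
        return False
    # 2. Physicochemical properties within the training-set ranges.
    for name, (low, high) in property_ranges(training).items():
        if not low <= candidate.properties[name] <= high:
            return False
    # 3. No unwanted substructures.
    if candidate.has_unwanted_substructure:
        return False
    # 4. Minimum drug-likeness.
    return candidate.qed >= min_qed


def constrained_score(candidate: Molecule,
                      training: Sequence[Molecule],
                      predicted_activity: float) -> float:
    """Assumed constraint rule: keep the predicted activity inside the domain, else 0."""
    return predicted_activity if in_applicability_domain(candidate, training) else 0.0


# Toy data, invented purely for illustration.
training = [
    Molecule(frozenset({1, 2, 3, 4}), {"mw": 300.0, "logp": 2.0}, False, 0.7),
    Molecule(frozenset({2, 3, 4, 5}), {"mw": 420.0, "logp": 3.5}, False, 0.6),
]
inside = Molecule(frozenset({1, 2, 3, 5}), {"mw": 350.0, "logp": 2.5}, False, 0.65)
outside = Molecule(frozenset({7, 8, 9}), {"mw": 900.0, "logp": 7.0}, True, 0.1)

print(in_applicability_domain(inside, training))
print(in_applicability_domain(outside, training))
print(constrained_score(outside, training, predicted_activity=0.9))
```

Dropping or relaxing individual checks in `in_applicability_domain` gives different domain definitions, which is the kind of comparison the abstract describes. A generator optimizing `constrained_score` would gain nothing from molecules outside the domain, however active the predictive model claims they are.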