Maximum Common Substructure Searching in Combinatorial Make-on-Demand Compound Spaces.
Robert Schmidt, Raphael Klein, Matthias Rarey. Published in: Journal of Chemical Information and Modeling (2021)
Commercial make-on-demand compound spaces have become increasingly popular within the past few years. Since these libraries are too large for enumeration, they are usually accessed using combinatorial fragment space technologies like FTrees-FS and SpaceLight. Although both search types are of high practical impact, they lack the ability to search for precise structural features on the atomic level. To address this important use case, we developed SpaceMACS, which enables efficient and precise maximum common induced substructure (MCIS) similarity and substructure searches within chemical fragment spaces. SpaceMACS enumerates a user-defined number of compounds in a multistep procedure. First, substructures of the query are extracted and matched to all fragments of the space. Then, partial results are combined into actual compounds of the space. In this way, SpaceMACS identifies common substructures even if they cross fragment borders. We applied SpaceMACS to three commercial fragment spaces, searching for the 150 000 most similar analogs to a glucosyltransferase binder from the literature. We were able to find almost all building blocks used for the synthesis of the 90 listed analogs and a plethora of additional results. SpaceMACS is the missing link to enable rational drug discovery on make-on-demand combinatorial catalogs. No matter whether initial compound suggestions come from de novo design, AI-based compound generation, or a medicinal chemist's drawing board, the method gives access to the structurally closest chemically available analogs in seconds to at most minutes.
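To make the term "maximum common induced substructure" concrete, the following is a minimal brute-force sketch in Python. It is not the SpaceMACS algorithm and does not operate on fragment spaces. It only illustrates what an MCIS between two small, fully enumerated molecules is. The graph encoding (a dict of element symbols plus a set of bonded atom pairs), the function name `mcis_size`, and the simplifications (hydrogens omitted, bond orders ignored, the common substructure not required to be connected) are all assumptions made for illustration.

```python
def mcis_size(atoms_a, bonds_a, atoms_b, bonds_b):
    """Return the atom count of a maximum common induced subgraph.

    atoms_*: dict mapping atom index -> element symbol (hypothetical encoding)
    bonds_*: set of frozenset({i, j}) for bonded atom pairs
    Exponential-time backtracking; only suitable for tiny toy molecules.
    """
    ids_a = list(atoms_a)
    ids_b = list(atoms_b)
    best = 0

    def bonded(bonds, i, j):
        return frozenset((i, j)) in bonds

    def extend(pos, mapping, used_b):
        nonlocal best
        best = max(best, len(mapping))
        remaining = len(ids_a) - pos
        # Stop if no atoms are left or the bound cannot beat the best so far.
        if remaining == 0 or len(mapping) + remaining <= best:
            return
        a = ids_a[pos]
        for b in ids_b:
            if b in used_b or atoms_a[a] != atoms_b[b]:
                continue
            # "Induced": bonds AND non-bonds must agree with every mapped pair.
            if all(bonded(bonds_a, a, a2) == bonded(bonds_b, b, b2)
                   for a2, b2 in mapping.items()):
                mapping[a] = b
                used_b.add(b)
                extend(pos + 1, mapping, used_b)
                del mapping[a]
                used_b.discard(b)
        # Also try leaving atom `a` out of the common substructure.
        extend(pos + 1, mapping, used_b)

    extend(0, {}, set())
    return best


# Heavy-atom toy graphs: ethanol (C-C-O), 1-propanol (C-C-C-O), dimethyl ether (C-O-C)
ethanol = ({1: "C", 2: "C", 3: "O"}, {frozenset((1, 2)), frozenset((2, 3))})
propanol = ({1: "C", 2: "C", 3: "C", 4: "O"},
            {frozenset((1, 2)), frozenset((2, 3)), frozenset((3, 4))})
ether = ({1: "C", 2: "O", 3: "C"}, {frozenset((1, 2)), frozenset((2, 3))})

print(mcis_size(*ethanol, *propanol))  # ethanol is fully contained in 1-propanol
print(mcis_size(*ethanol, *ether))     # only a C-O pair is shared
```

In this toy setting, all three heavy atoms of ethanol map onto 1-propanol, whereas ethanol and dimethyl ether share only a bonded C-O pair, because the induced condition forbids mapping ethanol's bonded carbons onto the ether's non-bonded carbons. According to the abstract, the step this sketch leaves out is the essential one in SpaceMACS: matching query substructures against fragments and combining partial results so that common substructures can cross fragment borders without enumerating the space.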