Comparison of Combinatorial Fragment Spaces and Its Application to Ultralarge Make-on-Demand Compound Catalogs.
Louis Bellmann, Patrick Penner, Marcus Gastreich, Matthias Rarey
Published in: Journal of Chemical Information and Modeling (2022)
The set of chemical compounds shared by two or more chemical libraries is routinely assessed as a means of comparing these libraries for various applications. Traditionally, this is achieved by comparing the members of the chemical libraries individually for identity. This approach becomes impractical for chemical libraries containing billions or even trillions of compounds. As a result, no such analysis exists for ultralarge chemical spaces like the Enamine REAL Space, which contains over 20 billion compounds. In this work, we present a novel tool called SpaceCompare for calculating the overlap of large, nonenumerable combinatorial fragment spaces. In contrast to existing methods, SpaceCompare utilizes topological fingerprints and the combinatorial character of these chemical spaces. The tool is able to determine the exact overlap of prominent spaces like Enamine's REAL Space, WuXi's GalaXi Space, and Otava's CHEMriya for the first time.
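The traditional member-by-member comparison described in the abstract can be illustrated with a minimal Python sketch. This is not SpaceCompare's algorithm. It is a toy model in which a combinatorial space is a list of reagent lists and a product is the joined string of one reagent per position. The names `enumerate_space`, `space_size`, and `enumerated_overlap`, along with the reagent labels, are hypothetical. The sketch also assumes that identical strings mean identical compounds, which real spaces do not guarantee, since different fragment combinations can yield the same molecule.

```python
from itertools import product
from math import prod


def enumerate_space(reagent_lists):
    """Enumerate every product of a toy space as a joined label (assumed identity key)."""
    return {"-".join(combo) for combo in product(*reagent_lists)}


def space_size(reagent_lists):
    """Number of products: the product of the reagent list lengths."""
    return prod(len(reagents) for reagents in reagent_lists)


def enumerated_overlap(space_a, space_b):
    """Traditional overlap: enumerate both spaces fully, then intersect the sets."""
    return enumerate_space(space_a) & enumerate_space(space_b)


# Two hypothetical two-position spaces with partly shared reagents.
space_a = [["A1", "A2", "A3"], ["B1", "B2"]]
space_b = [["A2", "A3", "A4"], ["B2", "B3"]]

print(sorted(enumerated_overlap(space_a, space_b)))  # ['A2-B2', 'A3-B2']
print(space_size(space_a))                           # 6

# Three positions with 10,000 reagents each already give 10**12 products,
# which is why full enumeration stops being practical.
print(space_size([range(10_000)] * 3))               # 1000000000000
```

The size of a space grows multiplicatively with the number of reagents per position, while the enumeration-based overlap needs every product in memory. That gap is the motivation the abstract gives for a method that works on the combinatorial description directly.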