Sampling and Mapping Chemical Space with Extended Similarity Indices.
Kenneth López-PérezEdgar López-LópezJosé Luis Medina-FrancoRamón Alain Miranda-QuintanaPublished in: Molecules (Basel, Switzerland) (2023)
Visualization of the chemical space is useful in many aspects of chemistry, including compound library design, diversity analysis, and exploring structure-property relationships, to name a few. Examples of notable research areas where the visualization of chemical space has strong applications are drug discovery and natural product research. However, the sheer volume of even comparatively small sub-sections of chemical space implies that we need to use approximations at the time of navigating through chemical space. ChemMaps is a visualization methodology that approximates the distribution of compounds in large datasets based on the selection of satellite compounds that yield a similar mapping of the whole dataset when principal component analysis on a similarity matrix is performed. Here, we show how the recently proposed extended similarity indices can help find regions that are relevant to sample satellites and reduce the amount of high-dimensional data needed to describe a library's chemical space.