Similar compounds versus similar conformers: complementarity between PubChem 2-D and 3-D neighboring sets.
Sunghwan Kim, Evan E. Bolton, Stephen H. Bryant. Published in: Journal of Cheminformatics (2016)
The results of our study indicate that, for the majority of compounds in PubChem, structural similarity to other compounds is recognized predominantly by either 2-D or 3-D neighboring, but not by both, showing a strong complementarity between 2-D and 3-D neighboring results. Therefore, despite its heavy computational cost, 3-D neighboring gives the user an alternative way to instantly access structurally similar molecules that cannot be detected if only 2-D neighboring is used.

Graphical Abstract

The binned distribution of the neighbor preference indices (NPIs) for all compounds in PubChem (left) has a bimodal shape with two maxima at NPI = ±1 and a minimum at NPI = 0, indicating that structural similarity between compounds in PubChem is recognized predominantly by either 2-D or 3-D neighboring, but not by both. The NPI histogram for the drug space (right) has a greater fraction of compounds with a strong preference for one neighboring method over the other (at NPI ≈ ±1) as well as of compounds with a neutral preference (at NPI ≈ 0), indicating that the drug space is very different from the PubChem space.
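The abstract does not give the NPI formula, so the sketch below is an assumption rather than the authors' definition. It uses a normalized difference of the two neighbor counts, which matches the stated range of -1 to +1 with a neutral value of 0. The sign convention (negative for 2-D preference, positive for 3-D preference), the function names, and the binning scheme are all hypothetical choices made for illustration.

```python
from typing import Iterable, List, Optional, Tuple


def neighbor_preference_index(n_2d: int, n_3d: int) -> Optional[float]:
    """Assumed NPI: normalized difference of 3-D and 2-D neighbor counts.

    Returns -1.0 when a compound has only 2-D neighbors, +1.0 when it has
    only 3-D neighbors, 0.0 when the counts are equal, and None when the
    compound has no neighbors at all (the index is undefined there).
    """
    total = n_2d + n_3d
    if total == 0:
        return None
    return (n_3d - n_2d) / total


def binned_npi(counts: Iterable[Tuple[int, int]], n_bins: int = 10) -> List[int]:
    """Histogram of NPI values over [-1, 1] using n_bins equal-width bins."""
    hist = [0] * n_bins
    for n_2d, n_3d in counts:
        npi = neighbor_preference_index(n_2d, n_3d)
        if npi is None:
            continue  # skip compounds without any neighbors
        # Map [-1, 1] onto bin indices; NPI = +1 falls into the last bin.
        idx = min(int((npi + 1.0) / 2.0 * n_bins), n_bins - 1)
        hist[idx] += 1
    return hist
```

Under this assumed definition, a bimodal distribution like the one described for PubChem would show most of the mass in the first and last bins, with few compounds near the middle.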