Cocrystals in the Cambridge Structural Database: a network approach.
Jan Joris Devogelaer, Hugo Meekes, Elias Vlieg, René de Gelder
Published in: Acta Crystallographica Section B: Structural Science, Crystal Engineering and Materials (2019)
To better understand which coformers to combine for the successful formation of a cocrystal, techniques from data mining and network science are used to analyze the data contained in the Cambridge Structural Database (CSD). A network of coformers is constructed from the cocrystal entries present in the CSD, and its properties are analyzed. From this network, clusters of coformers with a similar tendency to form cocrystals are extracted. The popularity of the coformers in the CSD is unevenly distributed: a small group of coformers accounts for most of the cocrystals, which makes the data set inherently biased. The coformers in the network are found to behave primarily in a bipartite manner, demonstrating the importance of combining complementary coformers for successful cocrystallization. Our analysis shows that the CSD coformer network is a promising source of information for knowledge-based cocrystal prediction.
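The network construction described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the coformer names are hypothetical placeholders rather than CSD entries, and the helper names `build_network`, `degrees`, and `two_color` are invented for illustration. Each cocrystal is treated as an undirected edge between two coformers. The degree of a node counts its distinct partners, which is one simple proxy for coformer popularity, and a breadth-first two-coloring checks whether the toy network is strictly bipartite. The abstract reports only primarily bipartite behavior, so a strict test like this is a simplification.

```python
from collections import defaultdict, deque

# Hypothetical cocrystal entries: each pair is two coformers found together.
# These names are illustrative placeholders, not data taken from the CSD.
cocrystals = [
    ("acid_A", "base_X"),
    ("acid_A", "base_Y"),
    ("acid_B", "base_X"),
    ("acid_A", "base_Z"),
]


def build_network(pairs):
    """Build an undirected coformer network as an adjacency mapping."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency


def degrees(adjacency):
    """Number of distinct cocrystal partners per coformer."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


def two_color(adjacency):
    """Return a 2-coloring if the network is strictly bipartite, else None."""
    color = {}
    for start in adjacency:
        if start in color:
            continue
        color[start] = 0
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in color:
                    # Assign the opposite side to each unvisited partner.
                    color[neighbor] = 1 - color[node]
                    queue.append(neighbor)
                elif color[neighbor] == color[node]:
                    # Two partners on the same side: an odd cycle exists.
                    return None
    return color


network = build_network(cocrystals)
partner_counts = degrees(network)
coloring = two_color(network)
```

In this toy data `acid_A` has the most partners, mirroring in miniature the uneven popularity the abstract describes. On real data, where bipartite behavior is only approximate, a graded measure such as the fraction of edges that cross a chosen partition would likely be more informative than a yes/no test.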