
The COMPAS Project: A Computational Database of Polycyclic Aromatic Systems. Phase 1: cata-Condensed Polybenzenoid Hydrocarbons.

Alexandra Wahab, Lara Pfuderer, Eno Paenurk, Renana Gershoni-Poranne
Published in: Journal of Chemical Information and Modeling (2022)
Chemical databases are an essential tool for data-driven investigation of structure-property relationships and for the design of novel functional compounds. We introduce the first phase of the COMPAS Project, a COMputational database of Polycyclic Aromatic Systems. In this phase, we developed two data sets containing the optimized ground-state structures and a selection of molecular properties of ∼34k and ∼9k cata-condensed polybenzenoid hydrocarbons (at the GFN2-xTB and B3LYP-D3BJ/def2-SVP levels, respectively) and placed them in the public domain. Herein, we describe the process of the data set generation, detail the information available within the data sets, and show the fundamental features of the generated data. We analyze the correlation between the two types of computations as well as the structure-property relationships of the calculated species. The data and insights gained from them can inform rational design of novel functional aromatic molecules for use in, e.g., organic electronics, and can provide a basis for additional data-driven machine- and deep-learning studies in chemistry.
Keyphrases
  • electronic health record
  • big data
  • deep learning
  • healthcare
  • quality improvement
  • emergency department
  • artificial intelligence
  • machine learning
  • mental health
  • data analysis
  • adverse drug
  • single molecule
  • neural network