COMPAS-3: a dataset of peri-condensed polybenzenoid hydrocarbons.

Alexandra Wahab, Renana Gershoni-Poranne
Published in: Physical Chemistry Chemical Physics: PCCP (2024)
We introduce the third installment of the COMPAS Project, a COMputational database of Polycyclic Aromatic Systems, focused on peri-condensed polybenzenoid hydrocarbons (PBHs). In this installment, we develop two datasets containing the optimized ground-state structures and a selection of molecular properties of ∼39k and ∼9k peri-condensed polybenzenoid hydrocarbons (at the GFN2-xTB and CAM-B3LYP-D3BJ/cc-pVDZ//CAM-B3LYP-D3BJ/def2-SVP levels, respectively). The manuscript details the enumeration and data generation processes and describes the information available within the datasets. An in-depth comparison between the two types of computation is performed, and it is found that the geometrical disagreement is maximal for slightly distorted molecules. In addition, a data-driven analysis of the structure-property trends of peri-condensed PBHs is performed, highlighting the effect of the size of peri-condensed islands and linearly annulated rings on the HOMO-LUMO gap. The insights described herein are important for rational design of novel functional aromatic molecules for use in, e.g., organic electronics. The generated datasets provide a basis for additional data-driven machine- and deep-learning studies in chemistry.