QUAM-AFM: A Free Database for Molecular Identification by Atomic Force Microscopy.
Jaime Carracedo-Cosme, Carlos Romero-Muñiz, Pablo Pou, Rubén Pérez
Published in: Journal of Chemical Information and Modeling (2022)
This paper introduces the Quasar Science Resources-Autonomous University of Madrid atomic force microscopy image data set (QUAM-AFM), the largest data set of simulated atomic force microscopy (AFM) images, generated from a selection of 685,513 molecules that span the most relevant bonding structures and chemical species in organic chemistry. For each molecule, QUAM-AFM contains 24 3D image stacks. Each stack consists of constant-height images simulated at 10 tip-sample distances, and each of the 24 stacks uses a different combination of AFM operational parameters. This results in a total of about 165 million images with a resolution of 256 × 256 pixels. The 3D stacks are especially well suited to chemical identification in AFM experiments using deep learning techniques. Besides the set of AFM images, the data provided for each molecule include ball-and-stick depictions, IUPAC names, chemical formulas, atomic coordinates, and a map of atom heights. To simplify the use of the collection as a source of information, we have developed a graphical user interface that allows searching for structures by CID number, IUPAC name, or chemical formula.
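The image count quoted in the abstract can be checked from the figures it gives. The following is a minimal sketch of that arithmetic only; the constant and function names are illustrative and are not part of the QUAM-AFM distribution or its interface.

```python
# Figures taken from the abstract; the names below are illustrative only.
N_MOLECULES = 685_513        # molecules in the data set
STACKS_PER_MOLECULE = 24     # one stack per combination of AFM operational parameters
IMAGES_PER_STACK = 10        # one constant-height image per tip-sample distance
IMAGE_SIDE = 256             # images are 256 x 256 pixels


def total_images(n_molecules: int, stacks: int, images_per_stack: int) -> int:
    """Number of simulated images across the whole data set."""
    return n_molecules * stacks * images_per_stack


def stack_shape(images_per_stack: int, side: int) -> tuple[int, int, int]:
    """Shape of one 3D stack as (tip-sample distances, height, width)."""
    return (images_per_stack, side, side)


n_images = total_images(N_MOLECULES, STACKS_PER_MOLECULE, IMAGES_PER_STACK)
print(f"{n_images:,} images")  # 164,523,120, i.e. about 165 million
print(stack_shape(IMAGES_PER_STACK, IMAGE_SIDE))  # (10, 256, 256)
```

The exact product is 164,523,120, which the abstract rounds to 165 million. The (10, 256, 256) layout is one natural way to hold a stack as an array for a deep learning model; the axis order is an assumption, not something the abstract specifies.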
Keyphrases
- atomic force microscopy
- deep learning
- high speed
- single molecule
- convolutional neural network
- artificial intelligence
- big data
- electronic health record
- machine learning
- high resolution
- optical coherence tomography
- public health
- molecular dynamics
- bioinformatics analysis
- human milk
- data analysis
- water soluble
- transition metal