High-confidence structural annotation of metabolites absent from spectral libraries.
Martin A Hoffmann, Louis-Félix Nothias, Marcus Ludwig, Markus Fleischauer, Emily C Gentry, Michael Witting, Pieter C Dorrestein, Kai Dührkop, Sebastian Böcker
Published in: Nature Biotechnology (2021)
Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but typically only a small fraction of spectra can be matched. Previous in silico methods search in structure databases but cannot distinguish between correct and incorrect annotations. Here we introduce the COSMIC workflow, which combines in silico structure database generation and annotation with a confidence score consisting of kernel density P value estimation and a support vector machine with enforced directionality of features. On diverse datasets, COSMIC annotates a substantial number of hits at low false discovery rates and outperforms spectral library search. To demonstrate that COSMIC can annotate structures never reported before, we annotated 12 natural bile acids. Nine of these annotations were confirmed by manual evaluation and two using synthetic standards. In human samples, we annotated and manually validated 315 molecular structures currently absent from the Human Metabolome Database. Application of COSMIC to data from 17,400 metabolomics experiments led to 1,715 high-confidence structural annotations that were absent from spectral libraries.
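The abstract names kernel density P value estimation as one ingredient of the confidence score but does not describe how it is computed. The following is only a generic sketch of that idea, not the COSMIC implementation: it fits a Gaussian kernel density to a set of null scores and reports the upper-tail probability of an observed score. The function name `kde_p_value`, the choice of a Gaussian kernel, the normal-reference bandwidth rule, and the example scores are all assumptions made for illustration.

```python
from statistics import NormalDist, stdev


def kde_p_value(score, null_scores, bandwidth=None):
    """Upper-tail P value of `score` under a Gaussian KDE of `null_scores`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(null_scores)
    if n < 2:
        raise ValueError("need at least two null scores")
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the paper).
        bandwidth = 1.06 * stdev(null_scores) * n ** (-1 / 5)
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")

    std_normal = NormalDist()
    # A Gaussian KDE is an equal-weight mixture of normals centred on the
    # null scores, so its survival function is the mean of the component
    # survival functions evaluated at the observed score.
    tail = sum(1.0 - std_normal.cdf((score - x) / bandwidth) for x in null_scores)
    return tail / n


# Made-up null scores, e.g. scores of hits assumed to be incorrect.
null = [0.0, 1.0, 2.0, 3.0, 4.0]
p_typical = kde_p_value(2.0, null)   # score in the middle of the null
p_extreme = kde_p_value(10.0, null)  # score far above the null
```

A score far above the null distribution receives a small P value, while a score in the bulk of the null does not. How such a P value would be combined with other features in a support vector machine is not specified in the abstract and is not attempted here.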