A learned embedding for efficient joint analysis of millions of mass spectra.

Wout Bittremieux, Damon H May, Jeffrey Bilmes, William Stafford Noble

Published in: Nature Methods (2022)

Computational methods that aim to exploit publicly available mass spectrometry repositories rely primarily on unsupervised clustering of spectra. Here we trained a deep neural network in a supervised fashion on the basis of previous assignments of peptides to spectra. The network, called 'GLEAMS', learns to embed spectra in a low-dimensional space in which spectra generated by the same peptide are close to one another. We applied GLEAMS for large-scale spectrum clustering, detecting groups of unidentified, proximal spectra representing the same peptide. We used these clusters to explore the dark proteome of repeatedly observed yet consistently unidentified mass spectra.
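The training objective described above, pulling spectra from the same peptide together in the embedding space, can be illustrated with a pairwise contrastive loss. This is a minimal sketch under stated assumptions, not the GLEAMS implementation: the abstract does not specify the loss function, and the names `contrastive_loss`, `euclidean`, and `margin` are hypothetical. Embeddings are plain lists of floats standing in for network outputs.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def contrastive_loss(pairs, margin=1.0):
    """Mean contrastive loss over labeled spectrum pairs.

    pairs  -- list of (embedding_a, embedding_b, same_peptide) tuples, where
              same_peptide is True if both spectra were assigned the same peptide.
    margin -- assumed hyperparameter: different-peptide pairs closer than this
              distance are penalized.
    """
    total = 0.0
    for emb_a, emb_b, same_peptide in pairs:
        d = euclidean(emb_a, emb_b)
        if same_peptide:
            # Same peptide: penalize any separation in the embedding space.
            total += d ** 2
        else:
            # Different peptides: penalize only if closer than the margin.
            total += max(margin - d, 0.0) ** 2
    return total / len(pairs)


# Toy example with 2-dimensional embeddings (values are illustrative only).
well_separated = [
    ([0.0, 0.0], [0.0, 0.0], True),   # same peptide, identical embeddings
    ([0.0, 0.0], [3.0, 4.0], False),  # different peptides, far apart
]
badly_separated = [
    ([0.0, 0.0], [0.0, 0.0], False),  # different peptides, collapsed together
    ([0.0, 0.0], [3.0, 4.0], True),   # same peptide, far apart
]
```

Under this assumed loss, an embedding that places same-peptide spectra together and different-peptide spectra beyond the margin incurs zero loss, while the reverse arrangement is penalized. Clustering unidentified spectra then amounts to grouping points that lie close together in the learned space.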

Keyphrases

density functional theory
mass spectrometry
neural network
machine learning
molecular dynamics
rna seq
liquid chromatography
resistance training