Spectral averaging with outlier rejection algorithms to increase identifications in top-down proteomics.
Austin V Carr, Nicholas E Bollis, John G Pavek, Michael R Shortreed, Lloyd M Smith. Published in: Proteomics (2024)
The identification of proteoforms by top-down proteomics requires both high-quality fragmentation spectra and the neutral mass of the proteoform from which the fragments derive. Intact proteoform spectra can be highly complex and may include multiple overlapping proteoforms, as well as many isotopic peaks and charge states. The resulting lower signal-to-noise ratios for intact proteins complicate downstream analyses such as deconvolution. Averaging multiple scans is a common way to improve signal-to-noise, but mass spectrometry data contain characteristic artifacts that can degrade the quality of an averaged spectrum. To overcome these limitations and increase signal-to-noise, we have implemented outlier rejection algorithms that efficiently and robustly remove outlier measurements from a set of MS1 scans prior to averaging. We have implemented averaging with rejection algorithms in the open-source, freely available proteomics search engine MetaMorpheus. Herein, we report the application of the averaging with rejection algorithms to direct injection and online liquid chromatography-mass spectrometry data. Averaging with rejection algorithms demonstrated a 45% increase in the number of proteoforms detected in Jurkat T cell lysate. We show that the increase is due to improved spectral quality, particularly in regions surrounding isotopic envelopes.
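The idea of rejecting outliers before averaging can be illustrated with a minimal sketch. It is not the MetaMorpheus implementation: it assumes the MS1 scans have already been binned onto a shared m/z grid, and it uses iterative sigma clipping as one generic example of a rejection rule. The function name `average_with_sigma_clipping` and the parameters `n_sigma` and `max_iter` are hypothetical and chosen for illustration only.

```python
import numpy as np

def average_with_sigma_clipping(scans, n_sigma=3.0, max_iter=5):
    """Average intensities per m/z bin after rejecting outliers.

    scans: array-like of shape (n_scans, n_bins); every row is one scan
    on the same m/z grid (an assumption of this sketch).
    """
    data = np.asarray(scans, dtype=float)
    keep = np.ones(data.shape, dtype=bool)  # True = measurement still in use

    for _ in range(max_iter):
        # Hide rejected measurements so they do not affect the statistics.
        masked = np.where(keep, data, np.nan)
        center = np.nanmedian(masked, axis=0)  # robust per-bin center
        spread = np.nanstd(masked, axis=0)     # per-bin standard deviation

        # Reject values farther than n_sigma * spread from the center.
        new_keep = keep & (np.abs(data - center) <= n_sigma * spread)
        if np.array_equal(new_keep, keep):
            break  # no further rejections, the mask has converged
        keep = new_keep

    # Average only the surviving measurements in each bin.
    return np.nanmean(np.where(keep, data, np.nan), axis=0)

# Ten synthetic scans of three bins, with one spurious spike in one scan.
scans = np.full((10, 3), 10.0)
scans[4, 1] = 1000.0
averaged = average_with_sigma_clipping(scans)
```

In this synthetic example the plain mean of the middle bin is pulled up by the spike, while the clipped average is not. With only a handful of scans a single spike can inflate the spread enough to survive a 3-sigma cut, so the threshold in a sketch like this would need to be tuned to the number of scans being averaged.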
Keyphrases
- mass spectrometry
- liquid chromatography
- machine learning
- deep learning
- gas chromatography
- big data
- high performance liquid chromatography
- high resolution mass spectrometry
- capillary electrophoresis
- air pollution
- tandem mass spectrometry
- high resolution
- computed tomography
- artificial intelligence
- electronic health record
- optical coherence tomography
- dual energy
- density functional theory
- simultaneous determination
- multiple sclerosis
- social media
- quality improvement
- magnetic resonance
- label free
- image quality
- solar cells