MealTime-MS: A Machine Learning-Guided Real-Time Mass Spectrometry Analysis for Protein Identification and Efficient Dynamic Exclusion.
Alexander R Pelletier, Yun-En Chung, Zhibin Ning, Nora Wong, Daniel Figeys, Mathieu Lavallée-Adam
Published in: Journal of the American Society for Mass Spectrometry (2020)
Mass spectrometry-based proteomics technologies are prime methods for the high-throughput identification of proteins in complex biological samples. Nevertheless, technical limitations still hinder the ability of mass spectrometry to identify low-abundance proteins in such samples. Characterizing these proteins is essential for a comprehensive understanding of the biological processes taking place in cells and tissues. Most mass spectrometry-based proteomics approaches still use a data-dependent acquisition strategy, which favors the collection of mass spectra from proteins of higher abundance. Because the computational identification of proteins from proteomics data is typically performed after mass spectrometry analysis, large numbers of mass spectra are redundantly acquired from the same abundant proteins, and few or no mass spectra are acquired for proteins of lower abundance. We therefore propose a novel supervised learning algorithm, MealTime-MS, that identifies proteins in real time as mass spectrometry data are acquired and prevents further data collection from confidently identified proteins, ultimately freeing mass spectrometry resources to improve the identification sensitivity of low-abundance proteins. We use real-time simulations of a previously performed mass spectrometry analysis of a HEK293 cell lysate to show that our approach can identify 92.1% of the proteins detected in the experiment using 66.2% of the MS2 spectra. We also demonstrate that our approach outperforms a previously proposed method, is sufficiently fast for real-time mass spectrometry analysis, and is flexible. Finally, MealTime-MS's efficient use of mass spectrometry resources will provide a more comprehensive characterization of proteomes in complex samples.
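The protein-level exclusion idea described in the abstract can be illustrated with a toy replay of an MS2 stream. This is only a minimal sketch under loud assumptions, not the MealTime-MS implementation: the class `ExclusionSimulator`, the `protein_peptides` map, and the `min_peptides` threshold are all hypothetical, the supervised classifier is replaced by a simple "enough distinct peptides" rule, and a peptide sequence stands in for the precursor that a real instrument would exclude by mass and retention time.

```python
from dataclasses import dataclass, field


@dataclass
class ExclusionSimulator:
    """Toy replay of an MS2 stream with protein-level exclusion."""

    # Hypothetical map: protein name -> set of its peptide sequences.
    protein_peptides: dict
    # Hypothetical stand-in for the classifier: a protein counts as
    # identified once this many distinct peptides have been matched.
    min_peptides: int = 2
    observed: dict = field(default_factory=dict)
    excluded_peptides: set = field(default_factory=set)
    identified: set = field(default_factory=set)
    spectra_used: int = 0

    def process(self, peptide: str) -> bool:
        """Return True if the spectrum is acquired, False if it is skipped."""
        if peptide in self.excluded_peptides:
            return False  # precursor belongs to an already identified protein
        self.spectra_used += 1
        for protein, peptides in self.protein_peptides.items():
            if peptide not in peptides:
                continue
            seen = self.observed.setdefault(protein, set())
            seen.add(peptide)
            if protein not in self.identified and len(seen) >= self.min_peptides:
                self.identified.add(protein)
                # Exclude every peptide of the identified protein from now on.
                self.excluded_peptides |= peptides
        return True


# Usage: replay a made-up stream of peptide identifications.
sim = ExclusionSimulator({"P1": {"AAK", "CCR", "DDK"}, "P2": {"EEK", "FFR"}})
stream = ["AAK", "AAK", "CCR", "DDK", "AAK", "EEK", "FFR", "EEK"]
acquired = [sim.process(p) for p in stream]
print(sim.spectra_used, "of", len(stream), "spectra used;",
      len(sim.identified), "proteins identified")
```

In this made-up stream, both proteins are identified while three of the eight spectra are skipped, which mirrors in miniature the trade-off the abstract reports between proteins identified and MS2 spectra used.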
Keyphrases
- mass spectrometry
- liquid chromatography
- gas chromatography
- capillary electrophoresis
- high performance liquid chromatography
- machine learning
- high resolution
- high throughput
- big data
- tandem mass spectrometry
- multiple sclerosis
- stem cells
- genome wide
- cell death
- molecular dynamics
- induced apoptosis
- simultaneous determination
- wastewater treatment
- endoplasmic reticulum stress