An Adaptive Pipeline To Maximize Isobaric Tagging Data in Large-Scale MS-Based Proteomics.

John Corthésy, Konstantinos Theofilatos, Seferina Mavroudi, Charlotte Macron, Ornella Cominetti, Mona Remlawi, Francesco Ferraro, Antonio Núñez Galindo, Martin Kussmann, Spiridon Likothanassis, Loïc Dayon

Published in: Journal of Proteome Research (2018)

Isobaric tagging is the method of choice in mass-spectrometry-based proteomics for comparing several conditions at a time. Despite its multiplexing capabilities, drawbacks appear when multiple experiments are merged for comparison in large sample-size studies, owing to missing values that result from the stochastic nature of the data-dependent acquisition mode. Another indirect cause of data incompleteness might derive from the typical proteomic data-processing workflow, which first identifies proteins in individual experiments and then quantifies only those identified proteins, leaving a large number of unmatched spectra with quantitative information unexploited. Inspired by untargeted metabolomic and label-free proteomic workflows, we developed a quantification-driven bioinformatic pipeline (Quantify then Identify (QtI)) that optimizes the processing of isobaric tandem mass tag (TMT) data from large-scale studies. This pipeline includes innovative features, such as peak filtering with a self-adaptive preprocessing pipeline optimization method, Peptide Match Rescue, and Optimized Post-Translational Modification. QtI outperforms a classical benchmark workflow in terms of quantification and identification rates, significantly reducing missing data while preserving unmatched features for quantitative comparison. The number of unexploited tandem mass spectra was reduced by 77% and 62% for a human cerebrospinal fluid data set and a human plasma data set, respectively.
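
The difference between the classical order (identify, then quantify) and the quantification-driven order described above can be illustrated with a toy sketch. This is not the QtI implementation: the `Spectrum` class, the function names, the intensity threshold, and the example values are all hypothetical and serve only to show why quantifying first leaves fewer spectra unexploited.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Spectrum:
    """Hypothetical tandem mass spectrum record (illustration only)."""
    scan_id: int
    reporter_intensities: Tuple[float, ...]  # reporter-ion signals, one per channel
    peptide: Optional[str] = None            # None when no database match was found


def identify_then_quantify(spectra: List[Spectrum]) -> List[Spectrum]:
    # Classical order: only spectra matched to a peptide contribute quantitative data.
    return [s for s in spectra if s.peptide is not None]


def quantify_then_identify(spectra: List[Spectrum],
                           min_total_intensity: float = 0.0) -> List[Spectrum]:
    # Quantification-driven order: keep every spectrum with usable reporter signal,
    # whether or not it was matched. The threshold is an assumed, simplistic filter.
    return [s for s in spectra if sum(s.reporter_intensities) > min_total_intensity]


def unexploited(spectra: List[Spectrum], kept: List[Spectrum]) -> int:
    # Number of spectra whose quantitative information is left unused.
    return len(spectra) - len(kept)


# Made-up example data: two matched spectra, one unmatched with signal, one empty.
spectra = [
    Spectrum(1, (100.0, 120.0), "PEPTIDEK"),
    Spectrum(2, (80.0, 75.0), None),
    Spectrum(3, (0.0, 0.0), None),
    Spectrum(4, (50.0, 60.0), "SAMPLER"),
]

classical = identify_then_quantify(spectra)
qti_like = quantify_then_identify(spectra)

print("unexploited (classical):", unexploited(spectra, classical))
print("unexploited (quantify first):", unexploited(spectra, qti_like))
```

In this toy case the unmatched spectrum with reporter signal (scan 2) is retained only by the quantify-first order, so it remains available for quantitative comparison and for a later identification attempt.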
