Reinvestigating the Correctness of Decoy-Based False Discovery Rate Control in Proteomics Tandem Mass Spectrometry.
Jack Freestone, William Stafford Noble, Uri Keich
Published in: Journal of Proteome Research (2024)
Traditional database search methods for the analysis of bottom-up proteomics tandem mass spectrometry (MS/MS) data are limited in their ability to detect peptides with post-translational modifications (PTMs). Recently, "open modification" database search strategies, in which the requirement that the mass of the database peptide closely matches the observed precursor mass is relaxed, have become popular as ways to find a wider variety of types of PTMs. Indeed, in one study, Kong et al. reported that the open modification search tool MSFragger can achieve higher statistical power to detect peptides than a traditional "narrow window" database search. We investigated this claim empirically and, in the process, uncovered a potential general problem with false discovery rate (FDR) control in the machine learning postprocessors Percolator and PeptideProphet. This problem might have contributed to Kong et al.'s report that their empirical results suggest that FDR control in the narrow window setting might generally be compromised. Indeed, reanalyzing the same data while using a more standard form of target-decoy competition-based FDR control, we found that, after accounting for chimeric spectra as well as for the inherent difference in the number of candidates in open and narrow searches, the data does not provide sufficient evidence that FDR control in proteomics MS/MS database search is inherently problematic.
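For readers unfamiliar with target-decoy competition (TDC), the idea can be sketched in a few lines. This is a minimal illustration of one common formulation, not the procedure used in the paper: the abstract does not spell out its exact estimator. The function names, the `(score, is_decoy)` input layout, and the `+1` correction in the FDR estimate are all assumptions made for this sketch. Each spectrum is assumed to have already been searched against a concatenated target-plus-decoy database, with only its single best match retained.

```python
from typing import List, Tuple


def tdc_q_values(psms: List[Tuple[float, bool]]) -> List[Tuple[float, bool, float]]:
    """Assign a q-value to each winning match (hypothetical helper).

    psms: one (score, is_decoy) pair per spectrum, higher score = better.
    Returns (score, is_decoy, q_value) tuples sorted by decreasing score.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)

    # Estimated FDR at each rank: (decoys + 1) / targets, capped at 1.
    # The +1 is an assumed correction; other variants omit it.
    fdrs = []
    targets = decoys = 0
    for _, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        fdrs.append(min(1.0, (decoys + 1) / max(targets, 1)))

    # q-value: smallest estimated FDR at this rank or any lower-scoring rank.
    q_values = [0.0] * len(ranked)
    running_min = 1.0
    for i in range(len(ranked) - 1, -1, -1):
        running_min = min(running_min, fdrs[i])
        q_values[i] = running_min

    return [(s, d, q) for (s, d), q in zip(ranked, q_values)]


def accepted_targets(psms: List[Tuple[float, bool]], alpha: float) -> List[float]:
    """Scores of target matches whose q-value is at most alpha."""
    return [s for s, d, q in tdc_q_values(psms) if not d and q <= alpha]
```

The sketch ignores score ties and operates on whatever score the search engine reports. Rescoring by a postprocessor, which is where the abstract locates the potential problem, is outside its scope.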
Keyphrases
- tandem mass spectrometry
- ultra high performance liquid chromatography
- high performance liquid chromatography
- ms ms
- mass spectrometry
- liquid chromatography
- machine learning
- small molecule
- gas chromatography
- simultaneous determination
- electronic health record
- minimally invasive
- adverse drug
- big data
- high resolution
- liquid chromatography tandem mass spectrometry
- artificial intelligence
- cell therapy
- data analysis