Protein-Level Integration Strategy of Multiengine MS Spectra Search Results for Higher Confidence and Sequence Coverage.
Panpan ZhaoJiayong ZhongWanting LiuJing ZhaoGong ZhangPublished in: Journal of proteome research (2017)
Multiple search engines based on various models have been developed to search MS/MS spectra against a reference database, providing different results for the same data set. How to integrate these results efficiently with minimal compromise on false discoveries is an open question due to the lack of an independent, reliable, and highly sensitive standard. We took the advantage of the translating mRNA sequencing (RNC-seq) result as a standard to evaluate the integration strategies of the protein identifications from various search engines. We used seven mainstream search engines (Andromeda, Mascot, OMSSA, X!Tandem, pFind, InsPecT, and ProVerB) to search the same label-free MS data sets of human cell lines Hep3B, MHCCLM3, and MHCC97H from the Chinese C-HPP Consortium for Chromosomes 1, 8, and 20. As expected, the union of seven engines resulted in a boosted false identification, whereas the intersection of seven engines remarkably decreased the identification power. We found that identifications of at least two out of seven engines resulted in maximizing the protein identification power while minimizing the ratio of suspicious/translation-supported identifications (STR), as monitored by our STR index, based on RNC-Seq. Furthermore, this strategy also significantly improves the peptides coverage of the protein amino acid sequence. In summary, we demonstrated a simple strategy to significantly improve the performance for shotgun mass spectrometry by protein-level integrating multiple search engines, maximizing the utilization of the current MS spectra without additional experimental work.
Keyphrases
- amino acid
- mass spectrometry
- ms ms
- multiple sclerosis
- protein protein
- binding protein
- label free
- single cell
- liquid chromatography
- genome wide
- electronic health record
- endothelial cells
- healthcare
- dna methylation
- density functional theory
- machine learning
- small molecule
- capillary electrophoresis
- adverse drug
- induced pluripotent stem cells
- molecular dynamics
- affordable care act
- tandem mass spectrometry