Mass-Spectrometry-Based Near-Complete Draft of the Saccharomyces cerevisiae Proteome.

Yuan Gao, Lingyan Ping, Duc Duong, Chengpu Zhang, Eric B Dammer, Yanchang Li, Peiru Chen, Lei Chang, Huiying Gao, Junzhu Wu, Ping Xu

Published in: Journal of Proteome Research (2021)

Proteomics approaches designed to catalogue all open reading frames (ORFs) of an organism under a defined set of growth conditions have flourished in recent years. However, no proteome has been sequenced completely so far. Here, we generate the largest yeast proteome data set, comprising 5610 identified proteins, using a strategy based on optimized sample preparation and high-resolution mass spectrometry. Among the 5610 identified proteins, 94.1% are core proteins, which achieves near-complete coverage of the yeast ORFs. Comprehensive analysis of the missing proteins showed that they are missed mainly because of their physical properties. A review of protein abundance shows that our proteome encompasses a uniquely broad dynamic range. Additionally, these abundance values correlate highly with mRNA abundance, implying a high level of accuracy, sensitivity, and precision. We present examples of how the data could be used, including reannotating gene localization and providing expression evidence for pseudogenes. Our near-complete yeast proteome data set will be a useful and important resource for further systematic studies.
