Enrichment-Based Proteogenomics Identifies Microproteins, Missing Proteins, and Novel smORFs in Saccharomyces cerevisiae.
Cuitong He, Chenxi Jia, Yao Zhang, Ping Xu
Published in: Journal of Proteome Research (2018)
Microproteins are peptides of 100 amino acids (AA) or fewer, encoded by small open reading frames (smORFs). Microproteins have been shown to participate in and regulate a wide range of cellular functions. However, annotating and identifying them is challenging, in part because of their low molecular weight, low abundance, and hydrophobicity. These factors have left many smORFs unannotated during genome annotation and have made their products difficult to identify at the protein level. Large-scale enrichment of microproteins in a proteogenomics workflow makes it possible to identify microproteins efficiently and to discover unannotated smORFs in Saccharomyces cerevisiae. We integrated four microprotein-specific enrichment strategies to enhance coverage. We identified 117 microproteins, verified 31 missing proteins (MPs), and discovered 3 novel smORFs. The 31 MPs were confirmed by spectrum quality checking. The three novel smORFs (YKL104W-A, YHR052C-B, and YHR054C-B) were retained after spectrum quality checking, validation with synthetic peptides, homologue matching, and further checks. This study not only demonstrates that an extensively studied organism still contains smORF candidates awaiting annotation but also presents an efficient strategy for discovering small MPs. All MS data sets have been deposited to ProteomeXchange with identifier PXD008586.
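To make the 100-AA definition concrete, the following minimal sketch scans a DNA string for candidate smORFs. It is an illustration only and not the pipeline used in the study: the function name `find_smorfs`, the restriction to the three forward frames, and the requirement of an ATG start codon are all assumptions introduced here.

```python
# Illustrative toy scanner; NOT the authors' method.
# Assumptions: forward strand only, ATG start, standard stop codons.

MAX_MICROPROTEIN_AA = 100  # length cutoff given in the abstract
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_smorfs(dna, max_aa=MAX_MICROPROTEIN_AA, min_aa=1):
    """Return (start, end, n_aa) tuples for ORFs encoding at most max_aa residues.

    start and end are 0-based, end-exclusive, and the span includes the stop
    codon. n_aa counts residues including the initiator Met, excluding the stop.
    Each stop codon is paired with the first in-frame ATG upstream of it.
    """
    dna = dna.upper()
    hits = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i  # open a candidate ORF
            elif codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa <= max_aa:
                    hits.append((start, i + 3, n_aa))
                start = None  # close the ORF and keep scanning
    return hits


if __name__ == "__main__":
    # A two-residue toy ORF: ATG AAA TAA
    print(find_smorfs("ATGAAATAA"))
```

A real proteogenomics search database would typically also cover the reverse strand and, possibly, non-ATG start codons; those are deliberately left out to keep the sketch short.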
Keyphrases
- saccharomyces cerevisiae
- amino acid
- induced apoptosis
- small molecule
- mass spectrometry
- multiple sclerosis
- minimally invasive
- healthcare
- ms/ms
- risk assessment
- high throughput
- gene expression
- protein-protein
- cell proliferation
- artificial intelligence
- bioinformatics analysis
- cell cycle arrest
- signaling pathway
- climate change
- health insurance
- affordable care act
- rna seq