Is It Possible to Find Needles in a Haystack? Meta-Analysis of 1000+ MS/MS Files Provided by the Russian Proteomic Consortium for Mining Missing Proteins.

Ekaterina Poverennaya, Olga Kiseleva, Ekaterina Ilgisonis, Svetlana E Novikova, Arthur T Kopylov, Yuri D Ivanov, Alexey S Kononikhin, Mikhail V Gorshkov, Nikolay Kushlinskii, Alexander Archakov, Elena Ponomarenko
Published in: Proteomes (2020)
Despite the direct and indirect efforts of the proteomic community, the fraction of blind spots on the protein map is still significant. Almost 11% of human genes encode missing proteins, that is, proteins whose existence is still in doubt. Proteomics appears to have reached a stage where identifying each novel protein demands more attention and curiosity, including extending the search to unusual types of biomaterials and/or conditions. The current conventional approaches to the discovery of missing proteins seem to be exhausted, and alternatives may need to be investigated. Here, we present an approach to deciphering missing proteins that relies on non-standard methodological solutions and encompasses diverse MS/MS data obtained for rare types of biological samples by members of the Russian Proteomic community over the last five years. These data were re-analyzed in a uniform manner by three search engines that are part of the SearchGUI package. The study identified two missing and five uncertain proteins, each detected with two peptides. In addition, 149 proteins were detected with a single proteotypic peptide. Finally, we analyzed gene expression levels to suggest feasible targets for further validation of the missing and uncertain protein observations, validation that will fully meet the requirements of the international consortium. The MS data are available on the ProteomeXchange platform (PXD014300).
Keyphrases
  • ms ms
  • gene expression
  • healthcare
  • mental health
  • endothelial cells
  • electronic health record
  • mass spectrometry
  • dna methylation
  • big data
  • high throughput
  • working memory
  • data analysis
  • high resolution
  • tissue engineering