The Cryptic Bacterial Microproteome.

Igor Fesenko, Harutyun Sahakyan, Svetlana A Shabalina, Eugene V Koonin
Published in: bioRxiv : the preprint server for biology (2024)
Microproteins encoded by small open reading frames (smORFs) comprise the dark matter of proteomes. Although functional microproteins have been identified in diverse organisms from all three domains of life, bacterial smORFs remain poorly characterized. In this comprehensive study of intergenic smORFs (ismORFs, 15 to 70 codons) in 5,668 bacterial genomes of the family Enterobacteriaceae, we identified 67,297 clusters of ismORFs subject to purifying selection. The ismORFs mainly code for hydrophobic, potentially transmembrane, unstructured, or minimally structured microproteins. Using AlphaFold Multimer, we predicted interactions of some of the predicted microproteins encoded by transcribed ismORFs with proteins encoded by neighboring genes, revealing the potential of microproteins to regulate the activity of various proteins, particularly under stress. We compiled a catalog of predicted microprotein families with different levels of evidence from synteny analysis, structure prediction, and transcription and translation data. This study offers a resource for investigating the biological functions of microproteins.
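
To make the 15 to 70 codon size window concrete, the following is a minimal sketch of how candidate smORFs of that length could be enumerated in a single intergenic sequence. It is not the authors' pipeline: the function name `find_smorfs`, the ATG-only start codon, the forward-strand-only scan, and the choice to count codons excluding the stop codon are all simplifying assumptions made here for illustration (bacteria also use alternative start codons such as GTG and TTG, and a real scan would cover both strands).

```python
START = "ATG"                      # assumption: canonical start codon only
STOPS = {"TAA", "TAG", "TGA"}      # standard bacterial stop codons


def find_smorfs(seq, min_codons=15, max_codons=70):
    """Return (start, end, orf_sequence) tuples for forward-strand ORFs.

    The codon count excludes the stop codon (an assumption); `end` is the
    0-based index just past the stop codon.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] != START:
                i += 3
                continue
            # Extend from the start codon until the first in-frame stop.
            j = i
            while j + 3 <= len(seq):
                if seq[j:j + 3] in STOPS:
                    n_codons = (j - i) // 3
                    if min_codons <= n_codons <= max_codons:
                        hits.append((i, j + 3, seq[i:j + 3]))
                    break
                j += 3
            # Resume after this ORF (or at the sequence end if no stop was found).
            i = j + 3
    return hits


if __name__ == "__main__":
    # Toy sequence: one 15-codon ORF (ATG + 14 x GCT) followed by a stop.
    toy = "CC" + "ATG" + "GCT" * 14 + "TAA" + "GG"
    for start, end, orf in find_smorfs(toy):
        print(start, end, len(orf) // 3 - 1)
```

A length filter like this only produces candidates. The study's evidence for function comes from further steps named in the abstract (purifying selection, synteny, structure prediction, and transcription and translation data), which this sketch does not attempt.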
Keyphrases
  • minimally invasive
  • gene expression
  • machine learning
  • climate change
  • cystic fibrosis
  • deep learning
  • gram negative
  • artificial intelligence
  • klebsiella pneumoniae
  • heat stress