Mapping of long stretches of highly conserved sequences in over 6 million SARS-CoV-2 genomes.

Akhil Kumar, Rishika Kaushal, Himanshi Sharma, Khushboo Sharma, Manoj B Menon, Perumal Vivekanandan
Published in: Briefings in functional genomics (2023)
We identified 11 conserved stretches in over 6.3 million SARS-CoV-2 genomes, including all the major variants of concern. Each conserved stretch is ≥100 nucleotides in length with ≥99.9% conservation at each nucleotide position. Interestingly, six of the eight conserved stretches in ORF1ab overlapped significantly with well-folded, experimentally verified RNA secondary structures. Furthermore, two of the conserved stretches were mapped to regions within the S2-subunit that undergo dynamic structural rearrangements during viral fusion. In addition, the conserved stretches were significantly depleted for zinc-finger antiviral protein (ZAP) binding sites, which facilitate the recognition and degradation of viral RNA. These highly conserved stretches in the SARS-CoV-2 genome were poorly conserved at the nucleotide level among closely related β-coronaviruses, thus representing ideal targets for highly specific and discriminatory diagnostic assays. Our findings highlight the role of structural constraints at both RNA and protein levels that contribute to the sequence conservation of specific genomic regions in SARS-CoV-2.
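The selection criterion stated in the abstract (a run of at least 100 nucleotides in which every position is at least 99.9% conserved) can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual pipeline, so this is only an illustration of the criterion: the function name `conserved_stretches`, its parameters, and the assumption that per-position conservation has already been computed as a list of fractions from an alignment are all hypothetical.

```python
from typing import List, Sequence, Tuple


def conserved_stretches(
    conservation: Sequence[float],
    min_len: int = 100,        # minimum stretch length, per the abstract
    threshold: float = 0.999,  # minimum per-position conservation, per the abstract
) -> List[Tuple[int, int]]:
    """Return half-open, 0-based (start, end) intervals in which every
    position has conservation >= threshold and the run length is >= min_len.

    `conservation` is assumed (hypothetically) to hold, for each genome
    position, the fraction of aligned genomes carrying the majority base.
    """
    stretches: List[Tuple[int, int]] = []
    start = None  # start index of the current qualifying run, if any

    for i, value in enumerate(conservation):
        if value >= threshold:
            if start is None:
                start = i  # open a new run
        else:
            # The run (if any) ends just before position i.
            if start is not None and i - start >= min_len:
                stretches.append((start, i))
            start = None

    # Close a run that extends to the end of the sequence.
    if start is not None and len(conservation) - start >= min_len:
        stretches.append((start, len(conservation)))

    return stretches


# Toy input: a 120-position run, a 50-position run (too short),
# and a 100-position run preceded by a position just below the threshold.
toy = [1.0] * 120 + [0.5] + [1.0] * 50 + [0.9985] + [0.9995] * 100
result = conserved_stretches(toy)
```

On this toy input only the first and last runs qualify; the 50-position run is dropped because it is shorter than `min_len`, even though every position in it is fully conserved.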
Keyphrases
  • sars cov
  • transcription factor
  • respiratory syndrome coronavirus
  • high resolution
  • copy number
  • small molecule
  • amino acid
  • genome wide
  • mass spectrometry
  • high throughput
  • coronavirus disease
  • protein protein
  • high density