Assessment of positive selection across SARS-CoV-2 variants via maximum likelihood.

Carly Middleton, Laura S. Kubatko
Published in: PLOS ONE (2023)
Study of the genome of the SARS-CoV-2 virus, particularly with regard to understanding evolution of the virus, is crucial for managing the COVID-19 pandemic. To this end, we sample viral genomes from the GISAID repository and use several of the maximum likelihood approaches implemented in PAML, a collection of open-source programs for phylogenetic analyses of DNA and protein sequences, to assess evidence for positive selection in the protein-coding regions of the SARS-CoV-2 genome. Across all major variants identified by June 2021, we find limited evidence for positive selection. In particular, we identify positive selection in a small proportion of sites (5–15%) in the protein-coding region of the spike protein across variants. Most other variants did not show a strong signal for positive selection overall, though there were indications of positive selection in the Delta and Kappa variants for the nucleocapsid protein. We additionally use a forward selection procedure to fit a model that allows branch-specific estimates of selection along a phylogeny relating the variants, and find that there is variation in the selective pressure across variants for the spike protein. Our results highlight the utility of computational approaches for identifying genomic regions under selection.
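The abstract does not state which PAML models were compared. As a hedged illustration only, maximum likelihood tests for positive selection are commonly carried out as a likelihood ratio test between two nested codon models, for example a null model without a positively selected site class and an alternative model with one. The sketch below assumes the two models differ by 2 free parameters and that the usual chi-square approximation applies; the function name and the log-likelihood values are hypothetical and are not taken from the paper.

```python
import math


def lrt_df2(lnl_null, lnl_alt):
    """Likelihood ratio test for nested models differing by 2 free parameters.

    Returns (test statistic, p-value). For a chi-square distribution with
    2 degrees of freedom the survival function is exactly exp(-x / 2),
    so no external statistics library is needed.
    """
    # The alternative model nests the null, so its log-likelihood should be
    # at least as large; clamp at zero to absorb optimizer noise.
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical log-likelihoods, for illustration only (not results from the paper).
lnl_null = -1234.56  # model without a positive-selection site class
lnl_alt = -1228.10   # model with a positive-selection site class

stat, p = lrt_df2(lnl_null, lnl_alt)
print(f"LRT statistic = {stat:.2f}, p-value = {p:.4g}")
```

A small p-value would favor the alternative model, that is, a class of sites evolving under positive selection. The proportion of such sites (the 5–15% reported for the spike protein) would be read from the fitted alternative model, not from this test.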
Keyphrases
  • sars cov
  • copy number
  • protein protein
  • respiratory syndrome coronavirus
  • amino acid
  • binding protein
  • public health
  • genome wide
  • small molecule
  • coronavirus disease