Genome size distributions in bacteria and archaea are strongly linked to evolutionary history at broad phylogenetic scales.

Carolina A Martinez-Gutierrez, Frank O'Neill Aylward
Published in: PLoS Genetics (2022)
The evolutionary forces that determine genome size in bacteria and archaea have been the subject of intense debate over the last few decades. Although the preferential loss of genes observed in prokaryotes is explained by deletional bias, the factors that promote or prevent the fixation of such gene losses often remain unclear. Importantly, statistical analyses on this topic typically do not consider the bias introduced by the shared ancestry of many lineages, which is critical when species are used as data points because their residuals may not be independent. In this study, we investigated genome size distributions across a broad diversity of bacteria and archaea to evaluate whether this trait is phylogenetically conserved at broad phylogenetic scales. After model fitting, Pagel's lambda indicated a strong phylogenetic signal in genome size data, suggesting that the diversification of this trait is influenced by shared evolutionary histories. We used a phylogenetic generalized least-squares (PGLS) analysis to test whether phylogeny influences the predictability of genome size from dN/dS ratios and 16S copy number, two variables that have previously been linked to genome size. The results of this analysis confirm that failure to account for evolutionary history can lead to biased interpretations of genome size predictors. Overall, our results indicate that although bacteria and archaea can rapidly gain and lose genetic material through gene transfers and deletions, respectively, a phylogenetic signal for genome size distributions can still be recovered at broad phylogenetic scales, and this signal should be taken into account when inferring the drivers of genome size evolution.
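To make the PGLS idea concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a phylogenetic covariance matrix `C` under Brownian motion (shared branch lengths between tips), applies Pagel's lambda by scaling its off-diagonal entries, and solves the generalized least-squares equations with NumPy. The four-tip tree, the predictor `x`, the response `y`, and the function names `pagel_lambda_transform` and `pgls_fit` are all hypothetical and chosen only for illustration.

```python
import numpy as np

def pagel_lambda_transform(C, lam):
    """Scale off-diagonal covariances by lam; keep the tip variances (diagonal)."""
    V = lam * C                          # new array, C is left untouched
    np.fill_diagonal(V, np.diag(C))      # restore the original diagonal
    return V

def pgls_fit(X, y, V):
    """GLS coefficients: beta = (X' V^-1 X)^-1 X' V^-1 y."""
    Vinv_X = np.linalg.solve(V, X)
    Vinv_y = np.linalg.solve(V, y)
    return np.linalg.solve(X.T @ Vinv_X, X.T @ Vinv_y)

# Hypothetical covariance for four tips on the tree ((A,B),(C,D)):
# total depth 1.0, with each pair sharing a branch of length 0.6.
C = np.array([
    [1.0, 0.6, 0.0, 0.0],
    [0.6, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

# Made-up trait values: x stands in for a predictor such as dN/dS,
# y for a response such as log genome size.
x = np.array([0.10, 0.20, 0.80, 0.90])
y = np.array([6.9, 6.7, 6.1, 6.2])
X = np.column_stack([np.ones_like(x), x])   # intercept + slope

# lambda = 0 ignores the phylogeny (equivalent to ordinary least squares here);
# lambda = 1 uses the full Brownian-motion covariance.
beta_ols = pgls_fit(X, y, pagel_lambda_transform(C, 0.0))
beta_pgls = pgls_fit(X, y, pagel_lambda_transform(C, 1.0))
print("lambda=0:", beta_ols)
print("lambda=1:", beta_pgls)
```

In practice lambda is usually estimated by maximum likelihood rather than fixed, and the covariance matrix is derived from an inferred tree; both steps are omitted here to keep the sketch short.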
Keyphrases
  • genome wide
  • copy number
  • dna methylation
  • high resolution
  • electronic health record
  • artificial intelligence
  • mass spectrometry
  • machine learning
  • human health