The pitfalls and virtues of population genetic summary statistics: Detecting selective sweeps in recent divergences.

Kevin Schneider, Tom J White, Sonia Mitchell, Colin E Adams, Richard Reeve, Kathryn R Elmer
Published in: Journal of evolutionary biology (2020)
During evolution, genomes are shaped by a plethora of forces that can leave characteristic signatures. A common goal when studying diverging populations is to detect the signatures of selective sweeps, which can be rather difficult in complex demographic scenarios, such as under secondary contact. Moreover, the detection of selective sweeps, especially in whole-genome data, often relies heavily on a narrow set of summary statistics that are affected by a multitude of factors, frequently leading to false positives and false negatives. Simulating genomic regions makes it possible to control these demographic and population genetic factors. We used simulations of large genomic regions to determine how different secondary contact and sympatric speciation scenarios affect the footprint of hard and soft selective sweeps in the presence of varying degrees of gene flow and recombination. We explored the ability of an array of population genetic summary statistics to detect the footprints of these selective sweeps under specific demographies. We focussed on metrics that do not require phased data or ancestral sequences and therefore have wide applicability. We found that a newly developed beta diversity measure, $\bar{B}_{GD}$, outperformed all other metrics in detecting selective sweeps and that $F_{ST}$ also performed well. High accuracy was also found in $\Delta\pi$ and genotype distance-derived metrics. The performance of most metrics strongly depended on factors such as the presence of an allopatric phase, migration rates, recombination, population growth, and whether the sweep was hard or soft. We provide suggestions for locating and analysing the response to selective sweeps in whole-genome data.
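To make two of the named statistics concrete, the following is a minimal sketch of how a per-window nucleotide diversity difference and $F_{ST}$ could be computed from unphased alternate-allele counts at biallelic sites. It is not the authors' pipeline. It assumes that $\Delta\pi$ is the difference in windowed $\pi$ between the two populations and that $F_{ST}$ is estimated with Hudson's ratio-of-sums estimator; the abstract specifies neither. All function names are hypothetical, and $\bar{B}_{GD}$ is omitted because its definition is not given here.

```python
def site_pi(alt_count, n):
    """Unbiased heterozygosity at one biallelic site, given n sampled chromosomes."""
    return 2.0 * alt_count * (n - alt_count) / (n * (n - 1))


def window_pi(alt_counts, n, window_len):
    """Nucleotide diversity per base pair over a window of window_len sites.

    alt_counts lists only the variable sites; invariant sites contribute zero.
    """
    return sum(site_pi(k, n) for k in alt_counts) / window_len


def delta_pi(alt1, n1, alt2, n2, window_len):
    """Assumed definition: pi in population 1 minus pi in population 2."""
    return window_pi(alt1, n1, window_len) - window_pi(alt2, n2, window_len)


def hudson_fst(alt1, n1, alt2, n2):
    """Hudson's F_ST for a window, as a ratio of sums over sites."""
    num = 0.0
    den = 0.0
    for k1, k2 in zip(alt1, alt2):
        p1, p2 = k1 / n1, k2 / n2
        # Squared frequency difference, corrected for finite sample size.
        num += (p1 - p2) ** 2 - p1 * (1 - p1) / (n1 - 1) - p2 * (1 - p2) / (n2 - 1)
        # Expected between-population heterozygosity.
        den += p1 * (1 - p2) + p2 * (1 - p1)
    return num / den if den > 0 else float("nan")


if __name__ == "__main__":
    # Toy window (made-up counts): population 1 is fixed at every site, as after
    # a sweep; population 2 remains polymorphic. 10 chromosomes per population.
    swept = [10, 10, 0, 10]
    neutral = [5, 4, 5, 6]
    print("delta pi:", delta_pi(swept, 10, neutral, 10, window_len=100))
    print("F_ST:", hudson_fst(swept, 10, neutral, 10))
```

In this toy window the swept population has no diversity left, so the difference in $\pi$ is negative and $F_{ST}$ is elevated, which is the kind of local footprint a genome scan looks for. Both functions need only allele counts, in keeping with the abstract's focus on metrics that require neither phased data nor ancestral sequences.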
Keyphrases
  • genome wide
  • copy number
  • electronic health record
  • climate change
  • big data
  • gene expression
  • oxidative stress
  • molecular dynamics
  • genetic diversity