An evaluation of the methods to estimate effective population size from measures of linkage disequilibrium.

Luis Alberto García-Cortés, Frederic Austerlitz, M Angeles R de Cara
Published in: Journal of Evolutionary Biology (2019)
In 1971, John Sved derived an approximate relationship between linkage disequilibrium (LD) and effective population size for an ideal finite population. This seminal work was extended by Sved and Feldman (Theor Pop Biol 4, 129, 1973) and Weir and Hill (Genetics 95, 477, 1980), who derived additional equations for the same purpose. These equations yield useful estimates of effective population size, as they require only a single sample in time. As these estimates are now commonly applied to a variety of genomic data, from arrays of single nucleotide polymorphisms to whole-genome data, some authors have investigated their bias through simulation studies and proposed corrections for different mating systems. However, the cause of the bias remains elusive. Here, we show the problems of using LD as a statistical measure and, analogously, the problems in estimating effective population size from such a measure. For that purpose, we compare three commonly used approaches with a transition probability-based method that we develop here, which provides an exact computation of LD. We show that the bias in the estimates of LD and effective population size is partly due to low-frequency markers, tightly linked markers, or a small total number of crossovers per generation. These biases, however, do not decrease when increasing sample size or using unlinked markers. Our results highlight the issues with such LD-based estimates of effective population size and suggest which of the methods studied here should be used in empirical studies, as well as the optimal distance between markers for such estimates.
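The abstract does not state the equations it evaluates. As a rough illustration only, the sketch below uses the commonly cited form of Sved's approximation, E[r²] ≈ 1 / (1 + 4·Ne·c), where c is the recombination fraction between two markers, and inverts it to obtain Ne from an observed r². The function names, the optional subtraction of 1/n as a sampling adjustment, and the parameter `n_chrom` are assumptions made for this example; they are not taken from the paper and do not reproduce its transition probability-based method.

```python
def expected_r2_sved(ne: float, c: float) -> float:
    """Expected r^2 under the commonly cited form of Sved's approximation.

    ne: effective population size (assumed > 0)
    c:  recombination fraction between the two markers (assumed >= 0)
    """
    return 1.0 / (1.0 + 4.0 * ne * c)


def ne_from_r2(r2: float, c: float, n_chrom=None) -> float:
    """Invert the approximation above to estimate Ne from an observed r^2.

    n_chrom is a hypothetical optional argument: when given, 1/n_chrom is
    subtracted from r2 as a simple finite-sample adjustment (an assumption
    of this sketch, not a correction prescribed by the abstract).
    """
    if c <= 0.0:
        raise ValueError("c must be positive; the formula is undefined at c = 0")
    if n_chrom is not None:
        r2 = r2 - 1.0 / n_chrom
    if not 0.0 < r2 < 1.0:
        raise ValueError("adjusted r2 must lie strictly between 0 and 1")
    return (1.0 / r2 - 1.0) / (4.0 * c)


if __name__ == "__main__":
    # Round trip: Ne = 100 and c = 0.01 give an expected r^2, which is
    # then converted back into an estimate of Ne.
    r2 = expected_r2_sved(100.0, 0.01)
    print(r2, ne_from_r2(r2, 0.01))
```

Because the estimate depends on 1/r², small errors in r² translate into large errors in Ne when r² is close to zero, which is one way to see why marker spacing matters for estimators of this kind.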