Low-coverage sequencing and Wahlund effect severely bias estimates of inbreeding, heterozygosity and effective population size in North American wolves.
Marty Kardos, Robin S Waples. Published in: Molecular Ecology (2024)
vonHoldt et al. (2024, Molecular Ecology, 33, e17231) (vH24) used low-coverage (average ~7X read depth) restriction site-associated DNA sequence data to estimate individual inbreeding and heterozygosity, and recent effective population size (Ne), in Great Lakes (GL) and Northern Rocky Mountain (RM) wolves. They concluded that RM heterozygosity rapidly declined between 1991 and 2020, and that Ne declined substantially in GL and RM over the last 50 generations. Here, we evaluate the effects of low sequence coverage and sampling strategy on vH24's findings and provide general recommendations for using sequence data to evaluate inbreeding, heterozygosity and Ne. Low-coverage sequencing resulted in downwardly biased estimates of individual inbreeding and heterozygosity, and an erroneous large temporal decline in RM heterozygosity due to declining read depth through time. Additionally, vH24's sampling strategy, which combined individuals from several genetically differentiated populations and across at least eight wolf generations, is expected to have resulted in severe downward bias in estimates of recent Ne for RM. We recommend using high-coverage sequence data (≥15-20X) to estimate inbreeding and heterozygosity. Carefully filtering individuals, loci and genotypes, and using genotype imputation or likelihoods can help to minimise bias when low-coverage sequence data must be used. For estimation of contemporary Ne, the marginal benefits of using more than 10^3 to 10^4 loci are small, so aggressive filtering of loci with low average read depth potentially can retain most individuals without sacrificing much precision. Individuals are relatively more valuable than loci because analyses of contemporary Ne should focus on roughly single-generation samples from local breeding populations.
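One way to see why read depth matters for heterozygosity is a simple read-sampling calculation. The sketch below is an illustration under stated assumptions, not the analysis used in the paper: it assumes each read at a truly heterozygous site samples either allele with probability 0.5, with no sequencing error or allelic dropout, and that a genotype is called heterozygous only when each allele is supported by a minimum number of reads. The function name `prob_both_alleles_seen` and the parameter `min_reads_per_allele` are hypothetical.

```python
from math import comb


def prob_both_alleles_seen(depth: int, min_reads_per_allele: int = 1) -> float:
    """Probability that a true heterozygote is detectable at a given read depth.

    Assumes (simplification) that the number of reads carrying one allele is
    Binomial(depth, 0.5) and that there is no sequencing error.
    """
    if depth < 2 * min_reads_per_allele:
        # Not enough reads for both alleles to reach the threshold.
        return 0.0
    # Count read configurations where both alleles meet the threshold.
    favourable = sum(
        comb(depth, k)
        for k in range(min_reads_per_allele, depth - min_reads_per_allele + 1)
    )
    return favourable / 2**depth


if __name__ == "__main__":
    # Compare a few depths, including ~7X and the 15-20X range.
    for depth in (2, 4, 7, 15, 20):
        p1 = prob_both_alleles_seen(depth, 1)
        p2 = prob_both_alleles_seen(depth, 2)
        print(f"depth={depth:>2}  P(>=1 read each)={p1:.4f}  P(>=2 reads each)={p2:.4f}")
```

Under these assumptions, heterozygotes are missed more often as depth falls, and more so when a caller requires more than one supporting read per allele. Because missed heterozygotes are scored as homozygotes, this is one mechanism by which a decline in read depth through time could look like a decline in heterozygosity.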