Sample coverage estimation, rarefaction and extrapolation based on sample-based abundance data.

Chun-Huo Chiu
Published in: Ecology (2023)
Sample coverage, the proportion of individuals in an assemblage that belong to species observed in a sample, is a measure of sample completeness. Standardizing to equal sample coverage, rather than equal sample size, has become a widely accepted way to compare diversity across multiple assemblages, because it more accurately reflects the true relationship among the assemblages' richness. In practice, sample-based abundance data are the most frequently used data type for evaluating species diversity. With this data type, sampling units (such as plots, nets, traps, or transects) are randomly selected from the target area, and the number of individuals of each species observed in each sampled unit is recorded. The individuals in the sample are therefore no longer randomly and independently sampled, and the Good-Turing estimators of abundance-based sample coverage in reference, rarefied, and extrapolated samples may be severely biased when individuals are highly spatially aggregated. Here, I derive a novel estimator of abundance-based sample coverage based on the Good-Turing frequency formula. I also introduce a new analytical approach that enables smooth coverage-based rarefaction and extrapolation for comparing richness among assemblages. Analyses of three ForestGEO permanent forest plot datasets show that the proposed estimator is nearly unbiased and that the new coverage-based standardization yields a less biased richness ratio.
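For readers unfamiliar with the Good-Turing coverage estimators the abstract refers to, the following is a minimal sketch of the classical individual-based versions, which assume individuals are sampled randomly and independently. It is not the new sample-based estimator derived in the article, whose formula is not given in the abstract. The function names and the example abundance vector are hypothetical; `f1` and `f2` denote the numbers of species observed exactly once and exactly twice, and `n` is the total number of individuals.

```python
def good_turing_coverage(abundances):
    """Classical Good-Turing coverage estimate: 1 - f1 / n.

    Assumes individuals are sampled randomly and independently,
    which is the assumption the abstract says is violated for
    sample-based (plot, trap, transect) data.
    """
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample contains no individuals")
    f1 = sum(1 for x in counts if x == 1)  # singletons
    return 1.0 - f1 / n


def chao_jost_coverage(abundances):
    """Refined estimate that also uses doubletons (Chao & Jost 2012):

    1 - (f1 / n) * ((n - 1) * f1 / ((n - 1) * f1 + 2 * f2))
    """
    counts = [x for x in abundances if x > 0]
    n = sum(counts)
    if n == 0:
        raise ValueError("sample contains no individuals")
    f1 = sum(1 for x in counts if x == 1)  # singletons
    f2 = sum(1 for x in counts if x == 2)  # doubletons
    if f1 == 0:
        return 1.0  # no singletons: estimated coverage is complete
    denominator = (n - 1) * f1 + 2 * f2
    if denominator == 0:
        return 0.0  # a single individual: nothing supports any coverage
    return 1.0 - (f1 / n) * ((n - 1) * f1 / denominator)


# Hypothetical species abundances pooled over all sampling units.
example = [10, 5, 2, 1, 1, 1]
print(round(good_turing_coverage(example), 4))
print(round(chao_jost_coverage(example), 4))
```

With spatially aggregated individuals, conspecifics tend to be recorded together within a sampling unit, so the singleton count no longer behaves as these formulas assume; that is the source of bias the article sets out to correct.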