Uncovering Effects from the Structure of Metabarcode Sequences for Metagenetic and Microbiome Analysis.

David C Molik, Michael E Pfrender, Scott J Emrich
Published in: Methods and protocols (2020)
The advent of next-generation sequencing has allowed for higher-throughput determination of which species live within a specific location. Here we establish that three analysis methods for estimating diversity within samples (Operational Taxonomic Units; the newer Amplicon Sequence Variants; and minhash, a method commonly found in sequence analysis) are affected by various properties of these sequence data. Using simulations, we show that the presence of Single Nucleotide Polymorphisms and the depth of coverage from each species affect the correlations between these approaches. Through this analysis, we provide insights that can inform the decision of which method to apply. Specifically, the presence of sequence read errors and variability in sequence read coverage differentially affect these processing methods.
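For readers unfamiliar with minhash, the following is a minimal sketch of the general technique, not the authors' implementation. It estimates the Jaccard similarity between the k-mer sets of two sequences by comparing per-hash minima. The function names, the k-mer length `k=8`, the number of hashes, and the use of salted BLAKE2b as the hash family are all illustrative assumptions.

```python
import hashlib


def kmers(seq, k=8):
    """Return the set of all length-k substrings (k-mers) of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def minhash_signature(seq, k=8, num_hashes=64):
    """Build a minhash signature: one minimum hash value per seeded hash.

    Assumes len(seq) >= k, so that at least one k-mer exists.
    """
    kmer_set = kmers(seq, k)
    signature = []
    for seed in range(num_hashes):
        # A different salt per seed gives a different hash function (assumption:
        # salted BLAKE2b stands in for a family of independent hashes).
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(km.encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for km in kmer_set
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of signature positions that agree; estimates Jaccard similarity."""
    matches = sum(1 for a, b in zip(sig_a, sig_b) if a == b)
    return matches / len(sig_a)
```

Because a single nucleotide substitution changes up to `k` overlapping k-mers, even one Single Nucleotide Polymorphism or read error lowers the estimated similarity between two otherwise identical sequences, which is one way the sequence properties discussed above can influence a minhash-based comparison.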
Keyphrases
  • healthcare
  • gene expression
  • copy number
  • high resolution
  • mass spectrometry
  • big data
  • data analysis
  • health insurance
  • solid phase extraction
  • monte carlo