Counting with DNA in metabarcoding studies: How should we convert sequence reads to dietary data?

Bruce Emerson DeagleAusten C ThomasJulie C McInnes Laurence John Clarke Eero Juhani Vesterinen Elizabeth L ClareTyler R KartzinelJ Paige Eveson

Published in: Molecular ecology (2018)

Advances in DNA sequencing technology have revolutionized the field of molecular analysis of trophic interactions, and it is now possible to recover counts of food DNA sequences from a wide range of dietary samples. But what do these counts mean? To obtain an accurate estimate of a consumer's diet should we work strictly with data sets summarizing frequency of occurrence of different food taxa, or is it possible to use relative number of sequences? Both approaches are applied to obtain semi-quantitative diet summaries, but occurrence data are often promoted as a more conservative and reliable option due to taxa-specific biases in recovery of sequences. We explore representative dietary metabarcoding data sets and point out that diet summaries based on occurrence data often overestimate the importance of food consumed in small quantities (potentially including low-level contaminants) and are sensitive to the count threshold used to define an occurrence. Our simulations indicate that using relative read abundance (RRA) information often provides a more accurate view of population-level diet even with moderate recovery biases incorporated; however, RRA summaries are sensitive to recovery biases impacting common diet taxa. Both approaches are more accurate when the mean number of food taxa in samples is small. The ideas presented here highlight the need to consider all sources of bias and to justify the methods used to interpret count data in dietary metabarcoding studies. We encourage researchers to continue addressing methodological challenges and acknowledge unanswered questions to help spur future investigations in this rapidly developing area of research.

Keyphrases