MARES, a replicable pipeline and curated reference database for marine eukaryote metabarcoding.

Vanessa Arranz, William S Pearman, J David Aguirre, Libby Liggins
Published in: Scientific Data (2020)
The use of DNA metabarcoding to characterise the biodiversity of environmental and community samples has exploded in recent years. However, taxonomic inferences from these studies are contingent on the quality and completeness of the sequence reference database used to characterise sample species-composition. In response, studies often develop custom reference databases to improve species assignment. The disadvantage of this approach is that it limits the potential for database re-use, and the transferability of inferences across studies. Here, we present the MARine Eukaryote Species (MARES) reference database for use in marine metabarcoding studies, created using a transparent and reproducible pipeline. MARES includes all COI sequences available in GenBank and BOLD for marine taxa, unified into a single taxonomy. Our pipeline facilitates the curation of sequences, synonymization of taxonomic identifiers used by different repositories, and formatting these data for use in taxonomic assignment tools. Overall, MARES provides a benchmark COI reference database for marine eukaryotes, and a standardised pipeline for (re)producing reference databases enabling integration and fair comparison of marine DNA metabarcoding results.
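The three pipeline steps named in the abstract (curating sequences, synonymizing taxonomic identifiers across repositories, and formatting the result for assignment tools) can be illustrated with a minimal Python sketch. This is not the MARES pipeline itself: the `Record` type, the function names, the length threshold, the synonym table, and all taxon names and accessions below are hypothetical and exist only to show the general idea.

```python
from dataclasses import dataclass, replace

VALID_BASES = set("ACGTN")  # assumption: any other symbol disqualifies a sequence


@dataclass(frozen=True)
class Record:
    source: str      # repository the record came from, e.g. "GenBank" or "BOLD"
    accession: str   # repository-specific identifier (made up here)
    taxon: str       # taxon name as given by the repository
    sequence: str    # nucleotide sequence


def unify_taxon(name, synonyms):
    """Normalise whitespace, then map a known synonym to its accepted name."""
    cleaned = " ".join(name.split())
    return synonyms.get(cleaned, cleaned)


def curate(records, synonyms, min_length=10):
    """Drop short or malformed sequences, unify names, remove duplicates."""
    seen = set()
    kept = []
    for rec in records:
        seq = rec.sequence.upper()
        if len(seq) < min_length or not set(seq) <= VALID_BASES:
            continue
        taxon = unify_taxon(rec.taxon, synonyms)
        key = (taxon, seq)
        if key in seen:  # same sequence already kept under the same accepted name
            continue
        seen.add(key)
        kept.append(replace(rec, taxon=taxon, sequence=seq))
    return kept


def to_fasta(records):
    """Format curated records as FASTA text, one header line per record."""
    return "".join(f">{r.accession} {r.taxon}\n{r.sequence}\n" for r in records)


# Hypothetical synonym table and records (names and accessions are invented).
synonyms = {"Examplea vetus": "Examplea nova"}
records = [
    Record("GenBank", "GB0001", "Examplea nova", "ACGTACGTACGT"),
    Record("BOLD", "BOLD0001", "Examplea  vetus", "acgtacgtacgt"),  # duplicate under a synonym
    Record("BOLD", "BOLD0002", "Examplea vetus", "ACGTTTGGCCAA"),
    Record("GenBank", "GB0002", "Otherus minor", "ACGT"),           # too short
    Record("GenBank", "GB0003", "Otherus minor", "ACGTACGT-XYZ"),   # invalid symbols
]
curated = curate(records, synonyms)
print(to_fasta(curated), end="")
```

In this sketch, unifying names before deduplication is what lets records from different repositories collapse onto a single taxonomy; a real pipeline would draw accepted names from an authoritative taxonomic source rather than a hand-written dictionary.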