A pile of pipelines: An overview of the bioinformatics software for metabarcoding data analyses.

Ali Hakimzadeh, Alejandro Abdala Asbun, Davide Albanese, Maria Bernard, Dominik Buchner, Benjamin Callahan, J Gregory Caporaso, Emily Curd, Christophe Djemiel, Mikael Brandström Durling, Vasco Elbrecht, Zachary Gold, Hyun S Gweon, Mehrdad Hajibabaei, Falk Hildebrand, Vladimir Mikryukov, Eric Normandeau, Ezgi Özkurt, Jonathan M Palmer, Géraldine Pascal, Teresita M Porter, Daniel Straub, Martti Vasar, Tomáš Větrovský, Haris Zafeiropoulos, Sten Anslan
Published in: Molecular Ecology Resources (2023)
Environmental DNA (eDNA) metabarcoding has gained growing attention as a strategy for monitoring biodiversity in ecology. However, taxa identifications produced through metabarcoding require sophisticated processing of high-throughput sequencing data from taxonomically informative DNA barcodes. Various sets of universal and taxon-specific primers have been developed, extending the usability of metabarcoding across archaea, bacteria and eukaryotes. Accordingly, a multitude of metabarcoding data analysis tools and pipelines have also been developed. Often, several developed workflows are designed to process the same amplicon sequencing data, making it somewhat puzzling to choose one among the plethora of existing pipelines. However, each pipeline has its own specific philosophy, strengths and limitations, which should be considered depending on the aims of any specific study, as well as the bioinformatics expertise of the user. In this review, we outline the input data requirements, supported operating systems and particular attributes of thirty-two amplicon processing pipelines with the goal of helping users to select a pipeline for their metabarcoding projects.
Keyphrases
  • data analysis
  • electronic health record
  • big data
  • circulating tumor
  • single molecule
  • high throughput sequencing
  • machine learning
  • climate change
  • quality improvement
  • circulating tumor cells