Debar: A sequence-by-sequence denoiser for COI-5P DNA barcode data.
Cameron M Nugent, Tyler A Elliott, Sujeevan Ratnasingham, Paul D N Hebert, Sarah J Adamowicz
Published in: Molecular Ecology Resources (2021)
DNA barcoding and metabarcoding are now widely used to advance species discovery and biodiversity assessments. High-throughput sequencing (HTS) has expanded the volume and scope of these analyses, but elevated error rates introduce noise into sequence records that can inflate estimates of biodiversity. Denoising of barcode and metabarcode data, the separation of biological signal from instrument (technical) noise, currently employs abundance-based methods that do not capitalize on the highly conserved structure of the cytochrome c oxidase subunit I (COI) region employed as the animal barcode. This manuscript introduces debar, an R package that uses a profile hidden Markov model to correct indel errors introduced into COI sequences by instrument error. In silico studies demonstrated that debar recognized 95% of artificially introduced indels in COI sequences. When applied to real-world data, debar reduced indel errors in circular consensus sequences obtained with the Sequel platform by 75%, and in those generated on the Ion Torrent S5 by 94%. The false correction rate was less than 0.1%, indicating that debar is receptive to the majority of true COI variation in the animal kingdom. In conclusion, the debar package improves DNA barcode and metabarcode workflows by aiding the generation of more accurate sequences, which in turn supports the characterization of species diversity.
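To make the idea of profile-HMM indel denoising concrete, the following is a minimal Python sketch, not debar's actual model or API. It assumes a toy profile with one match state per position of a short consensus string and hand-picked, position-independent probabilities; the function name `denoise_toy`, the consensus, and all parameter values are hypothetical. Viterbi decoding assigns each base of a read to a match or insert state, and the sketch then drops bases assigned to insert states and writes a placeholder `N` wherever a delete state was used. Substitutions are left untouched, since only indels are treated as noise here.

```python
import math

NEG = float("-inf")

# Hypothetical, position-independent log transition probabilities
# between match (M), insert (I) and delete (D) states.
T = {
    ("M", "M"): math.log(0.98), ("M", "I"): math.log(0.01), ("M", "D"): math.log(0.01),
    ("I", "M"): math.log(0.90), ("I", "I"): math.log(0.10),
    ("D", "M"): math.log(0.90), ("D", "D"): math.log(0.10),
}
EMIT_INSERT = math.log(0.25)  # insert states emit any base uniformly


def emit_match(consensus_base, base):
    """Log emission probability of a match state (toy values)."""
    return math.log(0.97) if base == consensus_base else math.log(0.01)


def denoise_toy(read, consensus):
    """Viterbi-decode `read` against the toy profile and correct indels."""
    L, n = len(consensus), len(read)
    # score[s][j][i]: best log probability of ending in state s after
    # using j profile positions and i read bases.
    score = {s: [[NEG] * (n + 1) for _ in range(L + 1)] for s in "MID"}
    back = {s: [[None] * (n + 1) for _ in range(L + 1)] for s in "MID"}
    score["M"][0][0] = 0.0  # begin state, treated as M at position 0

    def best_prev(state, j, i):
        # Best predecessor state for entering `state` from cell (j, i).
        return max((score[p][j][i] + T[(p, state)], p)
                   for p in "MID" if (p, state) in T)

    for j in range(L + 1):
        for i in range(n + 1):
            if j > 0 and i > 0:  # match: consume one profile position and one base
                s, p = best_prev("M", j - 1, i - 1)
                score["M"][j][i] = s + emit_match(consensus[j - 1], read[i - 1])
                back["M"][j][i] = p
            if i > 0:  # insert: consume one base only
                s, p = best_prev("I", j, i - 1)
                score["I"][j][i] = s + EMIT_INSERT
                back["I"][j][i] = p
            if j > 0:  # delete: consume one profile position only
                s, p = best_prev("D", j - 1, i)
                score["D"][j][i] = s
                back["D"][j][i] = p

    # Trace back from the best final state, rebuilding the corrected read.
    state = max("MID", key=lambda s: score[s][L][n])
    j, i, out = L, n, []
    while j > 0 or i > 0:
        prev = back[state][j][i]
        if state == "M":
            out.append(read[i - 1])  # keep the observed base, even if it differs
            j, i = j - 1, i - 1
        elif state == "I":
            i -= 1                   # drop the inserted base
        else:
            out.append("N")          # placeholder for a missing base
            j -= 1
        state = prev
    return "".join(reversed(out))


consensus = "ACGTACGTACGT"
print(denoise_toy("ACGTACTGTACGT", consensus))  # read with one extra base
print(denoise_toy("ACGTAGTACGT", consensus))    # read with one missing base
print(denoise_toy("ACGTACCTACGT", consensus))   # read with one substitution
```

A real COI profile would carry position-specific emission and transition probabilities estimated from reference barcodes, so the fixed values above only illustrate the decoding and correction steps.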