Shotgun metagenome data of a defined mock community using Oxford Nanopore, PacBio and Illumina technologies.
Volkan Sevim, Juna Lee, Robert Egan, Alicia Clum, Hope Hundley, Janey Lee, R Craig Everroad, Angela M Detweiler, Brad M Bebout, Jennifer Pett-Ridge, Markus Göker, Alison E Murray, Stephen R Lindemann, Hans-Peter Klenk, Ronan O'Malley, Matthew Zane, Jan-Fang Cheng, Alex Copeland, Christopher Daum, Esther Singer, Tanja Woyke
Published in: Scientific Data (2019)
Metagenomic sequence data from defined mock communities is crucial for the assessment of sequencing platform performance and downstream analyses, including assembly, binning and taxonomic assignment. We report a comparison of shotgun metagenome sequencing and assembly metrics of a defined microbial mock community using the Oxford Nanopore Technologies (ONT) MinION, PacBio and Illumina sequencing platforms. Our synthetic microbial community, BMock12, consists of 12 bacterial strains with genome sizes spanning 3.2-7.2 Mbp, 40-73% GC content, and 1.5-7.3% repeats. Size selection of both PacBio and ONT sequencing libraries prior to sequencing was essential to yield comparable relative abundances of organisms among all sequencing technologies. While the Illumina-based metagenome assembly yielded good coverage with few misassemblies, both the Illumina + ONT and the Illumina + PacBio hybrid assemblies greatly improved contiguity but also increased misassemblies, most notably in genomes with high sequence similarity to each other. Our resulting datasets allow evaluation and benchmarking of bioinformatics software on Illumina, PacBio and ONT platforms in parallel.
Keyphrases
- microbial community
- single cell
- high throughput sequencing
- healthcare
- mental health
- Escherichia coli
- antibiotic resistance genes
- electronic health record
- single molecule
- high throughput
- DNA methylation
- high resolution
- genome wide
- gene expression
- deep learning
- health insurance
- amino acid
- wastewater treatment
- tandem mass spectrometry