Long-read metagenomics of multiple displacement amplified DNA of low-biomass human gut phageomes by SACRA pre-processing chimeric reads.
Yuya Kiguchi, Suguru Nishijima, Naveen Kumar, Masahira Hattori, Wataru Suda
Published in: DNA research : an international journal for rapid publication of reports on genes and genomes (2021)
The human gut bacteriophage community (phageome) plays an important role in host health and disease; however, its overall structure remains poorly understood, partly because conventional short-read metagenomics yields many incomplete genomes. Here, we present long-read metagenomics of low-biomass phageome DNA amplified by multiple displacement amplification (MDA), enabled by a novel bioinformatics tool, the split amplified chimeric read algorithm (SACRA), which efficiently pre-processes the numerous chimeric reads generated by MDA. Across five samples, SACRA markedly reduced the average chimera ratio from 72% to 1.5% in PacBio reads with an average length of 1.8 kb. De novo assembly of the chimera-less PacBio long reads reconstructed contigs of ≥5 kb at an average proportion of 27%, compared with 1% for contigs assembled from MiSeq short reads, dramatically improving contig length and genome completeness. Comparison of PacBio and MiSeq contigs showed that MiSeq contigs were frequently fragmented near local repeats and hypervariable regions in the phage genomes, and that fragmentation was also caused by multiple homologous phage genomes coexisting in the community. We also developed a reference-independent method to assess the completeness of linear phage genomes. Overall, we established a SACRA-coupled long-read metagenomics approach that is robust to highly diverse gut phageomes and identifies high-quality circular and linear phage genomes with adequate sequence quantity.
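The idea of splitting chimeric reads and tracking a chimera ratio can be illustrated with a minimal sketch. This is not SACRA's actual implementation: it assumes the junction positions in each read are already known, whereas detecting them is the hard part that the tool addresses. It also assumes that "chimera ratio" means the fraction of reads containing at least one junction. The function names `split_at_breakpoints` and `chimera_ratio` and the toy reads are hypothetical.

```python
from typing import List, Sequence


def split_at_breakpoints(read: str, breakpoints: Sequence[int]) -> List[str]:
    """Split a read at each junction position; a read with no junctions is returned whole."""
    # Keep only unique positions that fall strictly inside the read.
    cuts = sorted({b for b in breakpoints if 0 < b < len(read)})
    edges = [0, *cuts, len(read)]
    return [read[start:end] for start, end in zip(edges, edges[1:])]


def chimera_ratio(breakpoints_per_read: Sequence[Sequence[int]]) -> float:
    """Fraction of reads with at least one junction (assumed definition)."""
    if not breakpoints_per_read:
        return 0.0
    chimeric = sum(1 for bps in breakpoints_per_read if len(bps) > 0)
    return chimeric / len(breakpoints_per_read)


# Toy input: (sequence, hypothetical junction positions). Not real data.
reads = [
    ("ACGTACGTAC", [4]),
    ("GGGGCCCC", []),
    ("TTTTAAAACCCC", [4, 8]),
    ("ACACAC", []),
]

# Ratio before splitting: two of the four toy reads carry a junction.
before = chimera_ratio([bps for _, bps in reads])

# After splitting, every segment is treated as an independent, non-chimeric read.
segments = [seg for seq, bps in reads for seg in split_at_breakpoints(seq, bps)]
```

Splitting preserves every base of the original read, so the total sequence quantity is unchanged while each resulting segment derives from a single template under the stated assumption.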