Hybracter: Enabling Scalable, Automated, Complete and Accurate Bacterial Genome Assemblies.

George Spyro Bouras, Ghais Houtak, Ryan R Wick, Vijini G Mallawaarachchi, Michael J Roach, Bhavya Papudeshi, Louise M Judd, Anna E Sheppard, Robert A Edwards, Sarah Vreugde
Published in: bioRxiv : the preprint server for biology (2023)
Improvements in the accuracy and availability of long-read sequencing mean that complete bacterial genomes are now routinely reconstructed using hybrid (i.e. short- and long-read) assembly approaches. Complete genomes allow a deeper understanding of bacterial evolution and genomic variation beyond single nucleotide variants (SNVs). They are also crucial for identifying plasmids, which often carry medically significant antimicrobial resistance (AMR) genes. However, small plasmids are often missed or misassembled by long-read assembly algorithms. Here, we present Hybracter, a method for fast, automatic and scalable recovery of near-perfect complete bacterial genomes using a long-read-first assembly approach. We compared Hybracter to existing automated hybrid assembly tools using a diverse panel of samples with manually curated ground truth reference genomes. We demonstrate that Hybracter is more accurate and faster than Unicycler, the existing gold standard automated hybrid assembler. We also show that Hybracter using long reads only is comparable to hybrid methods in recovering small plasmids.
Keyphrases
  • deep learning
  • machine learning
  • antimicrobial resistance
  • escherichia coli
  • high throughput
  • genome wide
  • klebsiella pneumoniae
  • gene expression
  • single cell
  • dna methylation
  • silver nanoparticles
  • transcription factor