Genome assembly in the telomere-to-telomere era.

Heng Li, Richard Durbin
Published in: Nature Reviews Genetics (2024)
Genome sequences largely determine the biology and encode the history of an organism, and de novo assembly (the process of reconstructing the genome sequence of an organism from sequencing reads) has been a central problem in bioinformatics for four decades. Until recently, genomes were typically assembled into fragments of a few megabases at best, but now technological advances in long-read sequencing enable the near-complete assembly of each chromosome, also known as telomere-to-telomere assembly, for many organisms. Here, we review recent progress on assembly algorithms and protocols, with a focus on how to derive near-telomere-to-telomere assemblies. We also discuss the additional developments that will be required to resolve remaining assembly gaps and to assemble non-diploid genomes.
Keyphrases
  • machine learning
  • genome wide
  • deep learning
  • dna methylation
  • single molecule