Improved sequence mapping using a complete reference genome and lift-over.

Nae-Chyun Chen, Luis F Paulin, Fritz J Sedlazeck, Sergey Koren, Adam M Phillippy, Ben Langmead
Published in: Nature Methods (2023)
Complete, telomere-to-telomere (T2T) genome assemblies promise improved analyses and the discovery of new variants, but many essential genomic resources remain associated with older reference genomes. Thus, there is a need to translate genomic features and read alignments between references. Here we describe a method called levioSAM2 that performs fast and accurate lift-over between assemblies using a whole-genome map. In addition to enabling the use of several references, we demonstrate that aligning reads to a high-quality reference (for example, T2T-CHM13) and lifting them over to an older reference (for example, GRCh38 from the Genome Reference Consortium (GRC)) improves the accuracy of the resulting variant calls on the older reference. By leveraging the quality improvements of T2T-CHM13, levioSAM2 reduces small and structural variant calling errors compared with GRC-based mapping on real short- and long-read datasets. Performance is especially improved for a set of complex, medically relevant genes, where the GRC references are of lower quality.
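To make the idea of lift-over through a whole-genome map concrete, the following is a minimal sketch of coordinate translation through a list of ungapped aligned blocks. It is not levioSAM2's implementation: the `Block` type, the `lift_position` function, and the example coordinates are hypothetical, and the sketch assumes a single contig, same-strand alignment, and blocks sorted by source start.

```python
from bisect import bisect_right
from typing import List, NamedTuple, Optional


class Block(NamedTuple):
    """One ungapped aligned block between two assemblies (hypothetical type)."""
    src_start: int  # 0-based start on the source assembly
    dst_start: int  # 0-based start on the target assembly
    length: int     # number of aligned bases in the block


def lift_position(blocks: List[Block], pos: int) -> Optional[int]:
    """Translate a 0-based source position to the target assembly.

    Returns None when the position falls in a gap between blocks,
    i.e. it has no counterpart in the target assembly.
    Assumes `blocks` is sorted by src_start and non-overlapping.
    """
    starts = [b.src_start for b in blocks]
    # Find the last block whose start is <= pos.
    i = bisect_right(starts, pos) - 1
    if i < 0:
        return None
    block = blocks[i]
    offset = pos - block.src_start
    if offset >= block.length:
        return None  # pos lies after the block ends: unaligned region
    return block.dst_start + offset


# Illustrative map: two blocks separated by a 10-base unaligned gap.
blocks = [Block(0, 100, 50), Block(60, 150, 40)]

lifted = [lift_position(blocks, p) for p in (10, 55, 60, 99, 100)]
print(lifted)  # [110, None, 150, 189, None]
```

Lifting a read alignment rather than a single position involves more than this, for example updating the alignment's start coordinate and handling reads that span block boundaries, which is the part a dedicated tool addresses.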
Keyphrases
  • high resolution
  • genome wide
  • copy number
  • physical activity
  • small molecule
  • gene expression
  • emergency department
  • middle aged
  • mass spectrometry
  • high throughput
  • rna seq
  • transcription factor
  • single cell
  • amino acid