GreenHill: a de novo chromosome-level scaffolding and phasing tool using Hi-C.

Shun Ouchi, Rei Kajitani, Takehiko Itoh
Published in: Genome Biology (2023)
Chromosome-level haplotype-resolved genome assembly is an important resource in molecular biology. However, current de novo haplotype assemblers require parental data or reference genomes and often fail to provide chromosome-level results. We present GreenHill, a novel scaffolding and phasing tool that takes contigs from various assemblers as input and reconstructs chromosome-level haplotypes using Hi-C, without parental or reference data. Its unique functions include a new error-correction method based on Hi-C contacts and the simultaneous use of Hi-C and long reads. Benchmarks reveal that GreenHill outperforms other approaches in contiguity and phasing accuracy, and that the majority of chromosome arms are entirely phased.
Keyphrases
  • copy number
  • electronic health record
  • genome wide
  • gene expression
  • machine learning
  • single molecule