Structural Refinement by Direct Mapping Reveals Assembly Inconsistencies near Hi-C Junctions.
Luca Marcolungo, Leonardo Vincenzi, Matteo Ballottari, Michela Cecchin, Emanuela Cosentino, Thomas Mignani, Antonina Limongi, Irene Ferraris, Matteo Orlandi, Marzia Rossato, Massimo Delledonne
Published in: Plants (Basel, Switzerland) (2023)
High-throughput chromosome conformation capture (Hi-C) is widely used for scaffolding in de novo assembly because it produces highly contiguous genomes, but its indirect statistical approach can introduce connection errors. We employed optical mapping (Bionano Genomics) as an orthogonal scaffolding technology to assess the structural solidity of Hi-C reconstructed scaffolds. Optical maps were used to assess the correctness of five de novo genome assemblies based on long-read sequencing for contig generation and Hi-C for scaffolding. Hundreds of inconsistencies were found between the reconstructions generated using the Hi-C and optical mapping approaches. Manual inspection, exploiting raw long-read sequencing data and optical maps, confirmed that several of these conflicts were derived from Hi-C joining errors. Such misjoins were widespread, involved the connection of both small and large contigs, and even overlapped annotated genes. We conclude that the integration of optical mapping data after, not before, Hi-C-based scaffolding improves the quality of the assembly and limits reconstruction errors by highlighting misjoins that can then be subjected to further investigation.