A deep learning-based method enables the automatic and accurate assembly of chromosome-level genomes.
Zijie Jiang, Zhixiang Peng, Zhaoyuan Wei, Jiahe Sun, Yongjiang Luo, Lingzi Bie, Guoqing Zhang, Yi Wang
Published in: Nucleic Acids Research (2024)
The application of high-throughput chromosome conformation capture (Hi-C) technology enables the construction of chromosome-level assemblies. However, the correction of errors and the anchoring of sequences to chromosomes in the assembly remain significant challenges. In this study, we developed a deep learning-based method, AutoHiC, to address the challenges in chromosome-level genome assembly by enhancing contiguity and accuracy. Conventional Hi-C-aided scaffolding often requires manual refinement, but AutoHiC instead utilizes Hi-C data for automated workflows and iterative error correction. When trained on data from 300+ species, AutoHiC demonstrated a robust average error detection accuracy exceeding 90%. The benchmarking results confirmed its significant impact on genome contiguity and error correction. The innovative approach and comprehensive results of AutoHiC constitute a breakthrough in automated error detection, promising more accurate genome assemblies for advancing genomics research.
Keyphrases
- deep learning
- high throughput
- copy number
- artificial intelligence
- machine learning
- convolutional neural network
- big data
- genome wide
- electronic health record
- single cell
- loop mediated isothermal amplification
- label free
- resistance training
- gene expression
- magnetic resonance
- data analysis
- molecular dynamics simulations
- neural network