Highly accurate fluorogenic DNA sequencing with information theory-based error correction.

Zitian Chen, Wenxiong Zhou, Shuo Qiao, Li Kang, Haifeng Duan, X Sunney Xie, Yanyi Huang
Published in: Nature Biotechnology (2017)
Eliminating errors in next-generation DNA sequencing has proved challenging. Here we present error-correction code (ECC) sequencing, a method to greatly improve sequencing accuracy by combining fluorogenic sequencing-by-synthesis (SBS) with an information theory-based error-correction algorithm. ECC embeds redundancy in sequencing reads by creating three orthogonal degenerate sequences, generated by alternate dual-base reactions. This is similar to encoding and decoding strategies that have proved effective in detecting and correcting errors in information communication and storage. We show that, when combined with a fluorogenic SBS chemistry with raw accuracy of 98.1%, ECC sequencing provides single-end, error-free sequences up to 200 bp. ECC approaches should enable accurate identification of extremely rare genomic variations in various applications in biology and medicine.
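The redundancy idea in the abstract can be illustrated with a minimal Python sketch. It assumes a simplified per-base model in which each of three sequencing rounds reports only which of two base pairs a position belongs to, written here with the standard IUPAC codes M/K (AC vs GT), R/Y (AG vs CT), and W/S (AT vs CG). The names `ENCODINGS`, `encode`, and `decode` are hypothetical, and the sketch only flags inconsistent positions. It does not model fluorogenic signals or homopolymer run lengths, and it is not the authors' decoding algorithm.

```python
# Simplified, hypothetical model: each round maps a base to one of two classes.
ENCODINGS = {
    "MK": {"A": "M", "C": "M", "G": "K", "T": "K"},
    "RY": {"A": "R", "G": "R", "C": "Y", "T": "Y"},
    "WS": {"A": "W", "T": "W", "C": "S", "G": "S"},
}


def encode(seq):
    """Return the three degenerate sequences for a DNA string of A/C/G/T."""
    return {
        name: "".join(table[base] for base in seq)
        for name, table in ENCODINGS.items()
    }


def decode(reads):
    """Recover bases from two or three degenerate reads of equal length.

    At each position, intersect the base sets allowed by every read.
    A single surviving base is reported; anything else becomes 'N',
    which marks an ambiguous or mutually inconsistent position.
    """
    length = len(next(iter(reads.values())))
    decoded = []
    for i in range(length):
        candidates = set("ACGT")
        for name, read in reads.items():
            table = ENCODINGS[name]
            candidates &= {b for b, sym in table.items() if sym == read[i]}
        decoded.append(candidates.pop() if len(candidates) == 1 else "N")
    return "".join(decoded)


if __name__ == "__main__":
    reads = encode("GATTACA")
    print(reads)
    print(decode(reads))

    # Any two encodings already determine the sequence; the third is redundant.
    print(decode({"MK": reads["MK"], "RY": reads["RY"]}))

    # A single wrong symbol in one read contradicts the other two.
    corrupted = dict(reads, WS="W" + reads["WS"][1:])
    print(decode(corrupted))
```

In this toy model any two of the three degenerate sequences are enough to recover the bases, so the third acts like a parity check: a single wrong symbol at a position leaves no base consistent with all three reads, and the position is flagged rather than silently miscalled.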
Keyphrases
  • single cell
  • machine learning
  • high resolution
  • single molecule
  • emergency department
  • deep learning
  • copy number
  • quality improvement
  • electronic health record
  • genome wide
  • drug discovery