NGS based haplotype assembly using matrix completion.

Sina Majidian, Mohammad Hossein Kahaei
Published in: PloS one (2019)
We apply matrix completion to haplotype assembly from next-generation sequencing (NGS) reads and develop three new algorithms: HapSVT, HapNuc, and HapOPT. A mathematical model converts the reads into an incomplete matrix, whose unknown entries are then estimated. The completed matrix is quantized and decoded to estimate the haplotypes. The algorithms are compared with state-of-the-art methods on simulated data and on real fosmid data. HapOPT achieves a better SNP missing rate and haplotype block length than HapCUT2, with comparable accuracy in terms of reconstruction rate and switch error rate. A MATLAB implementation of the proposed algorithms is freely available at https://github.com/smajidian/HapMC.
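The pipeline described above (reads to incomplete matrix, completion, quantization, decoding) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB code and not HapSVT, HapNuc, or HapOPT themselves: it assumes a diploid, error-free toy case, encodes alleles as +1/-1 with 0 for uncovered SNPs, and swaps in a simple iterative rank-1 SVD imputation as a stand-in for the paper's completion solvers. The function names `complete_rank1` and `decode_haplotype`, the encoding, and the toy reads are all hypothetical choices made for illustration.

```python
import numpy as np

def complete_rank1(reads, n_iter=50):
    """Fill unobserved entries (0) with a rank-1 estimate.

    Stand-in for a matrix completion solver: observed entries are
    kept fixed, missing ones are re-estimated from the leading
    singular triplet on every pass.
    """
    reads = np.asarray(reads, dtype=float)
    mask = reads != 0                      # True where a read covers a SNP
    estimate = np.zeros_like(reads)
    for _ in range(n_iter):
        filled = np.where(mask, reads, estimate)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        estimate = s[0] * np.outer(U[:, 0], Vt[0])
    return np.where(mask, reads, estimate)

def decode_haplotype(quantized):
    """Majority vote over rows after aligning each row to row 0."""
    ref = quantized[0]
    aligned = quantized * np.sign(quantized @ ref)[:, None]
    votes = aligned.sum(axis=0)
    h = np.where(votes >= 0, 1, -1)
    return h if h[0] > 0 else -h           # fix the global sign convention

# Toy data (assumed): 6 reads over 6 SNPs, drawn from the haplotype
# [1, -1, 1, 1, -1, -1] or its complement; 0 marks an uncovered SNP.
reads = np.array([
    [ 1, -1,  1,  0,  0,  0],
    [ 0,  1, -1, -1,  0,  0],
    [ 0,  0,  1,  1, -1,  0],
    [ 0,  0,  0, -1,  1,  1],
    [ 1, -1,  1,  1,  0,  0],
    [ 0,  0, -1, -1,  1,  1],
])

completed = complete_rank1(reads)
quantized = np.sign(completed).astype(int)  # quantize to +1 / -1
haplotype = decode_haplotype(quantized)
print(haplotype, -haplotype)                # one haplotype and its complement
```

The rank-1 choice reflects the assumption that, without sequencing errors, every diploid read is a masked copy of one haplotype or its complement, so the fully observed read matrix would have rank one. Real data with errors would need a more tolerant solver and a proper evaluation, which this sketch does not attempt.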
Keyphrases
  • machine learning
  • deep learning
  • big data
  • electronic health record
  • quality improvement
  • artificial intelligence
  • genome wide
  • gene expression
  • genetic diversity