
An efficient and tunable parameter to improve variant calling for whole genome and exome sequencing data.

Yong Ju Ahn, Kesavan Markkandan, In-Pyo Baek, Seyoung Mun, Wooseok Lee, Heui-Soo Kim, Kyudong Han
Published in: Genes & genomics (2017)
Next generation sequencing (NGS) has been applied in fields ranging from agriculture to clinical research, and many sequencing platforms are available for obtaining accurate and consistent results. However, these platforms show amplification bias that affects variant calling in personal genomes. Here, we sequenced whole genomes and whole exomes from ten Korean individuals using Illumina and Ion Proton, respectively, to assess the vulnerability and accuracy of each NGS platform in GC-rich and GC-poor regions. In total, 1013 Gb of reads from Illumina and ~39.1 Gb of reads from Ion Proton were analyzed using the BWA-GATK variant calling pipeline. Furthermore, in conjunction with the VQSR tool and detailed filtering strategies, we obtained high-quality variants. Finally, ten variants each from the Illumina-only, Ion Proton-only, and intersection sets were selected for Sanger validation. The validation results revealed that the Illumina platform showed higher accuracy than Ion Proton. The described filtering methods are advantageous for large population-based whole genome studies designed to identify common and rare variations associated with complex diseases.
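The split of calls into Illumina-only, Ion Proton-only, and intersection sets can be illustrated with a minimal Python sketch. It assumes each variant is keyed by chromosome, position, reference allele, and alternate allele. The `Variant` type, the `partition_calls` helper, and the toy calls are hypothetical illustrations, not the authors' code or data.

```python
from typing import Dict, Iterable, NamedTuple, Set


class Variant(NamedTuple):
    """Hypothetical variant key: two calls match only if all four fields agree."""
    chrom: str
    pos: int
    ref: str
    alt: str


def partition_calls(
    illumina: Iterable[Variant], ion_proton: Iterable[Variant]
) -> Dict[str, Set[Variant]]:
    """Split two call sets into platform-specific and shared variants."""
    a, b = set(illumina), set(ion_proton)
    return {
        "illumina_only": a - b,    # called by Illumina but not Ion Proton
        "ion_proton_only": b - a,  # called by Ion Proton but not Illumina
        "intersection": a & b,     # called by both platforms
    }


# Toy, made-up calls for illustration only (not data from the study).
illumina_calls = [
    Variant("chr1", 1000, "A", "G"),
    Variant("chr1", 2000, "C", "T"),
]
ion_proton_calls = [
    Variant("chr1", 2000, "C", "T"),
    Variant("chr2", 500, "G", "A"),
]

groups = partition_calls(illumina_calls, ion_proton_calls)
for name, variants in groups.items():
    print(name, sorted(variants))
```

In practice the two call sets would be read from the filtered VCF files of each platform, and candidates for validation would then be drawn from each of the three groups.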
Keyphrases
  • copy number
  • single cell
  • high throughput sequencing
  • risk assessment
  • high resolution
  • gene expression
  • heavy metals
  • electronic health record
  • big data
  • genome wide
  • deep learning
  • circulating tumor cells
  • quantum dots