Clair3-trio: high-performance Nanopore long-read variant calling in family trios with trio-to-trio deep neural networks.

Junhao Su, Zhenxian Zheng, Syed Shakeel Ahmed, Tak-Wah Lam, Ruibang Luo
Published in: Briefings in bioinformatics (2022)
Accurate identification of genetic variants from family child-mother-father trio sequencing data is important in genomics. However, state-of-the-art approaches treat variant calling from trios as three independent tasks, which limits their calling accuracy for Nanopore long-read sequencing data. For better trio variant calling, we introduce Clair3-Trio, the first variant caller tailored for family trio data from Nanopore long reads. Clair3-Trio employs a Trio-to-Trio deep neural network model that takes the sequencing information of all three family members as input and outputs the predicted variants for the whole trio within a single model. We also present MCVLoss, a novel loss function tailor-made for variant calling in trios that leverages an explicit encoding of Mendelian inheritance. Clair3-Trio showed comprehensive improvement in experiments and predicted far fewer variants that violate Mendelian inheritance than current state-of-the-art methods. We also demonstrated that our Trio-to-Trio model is more accurate than competing architectures. Clair3-Trio is accessible as a free, open-source project at https://github.com/HKU-BAL/Clair3-Trio.
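To make the notion of a Mendelian inheritance violation concrete, the following minimal Python sketch checks whether a child's diploid genotype can be formed by taking one allele from each parent. It is an illustration only, not Clair3-Trio's code and not the MCVLoss function. The function names `is_mendelian_consistent` and `count_violations` are hypothetical, genotypes are assumed to be unphased pairs of allele indices at an autosomal site (0 for the reference allele, 1 and up for alternates), and true de novo mutations would be counted as violations under this simple rule.

```python
from itertools import product


def is_mendelian_consistent(child, mother, father):
    """Return True if the child's genotype can be built from one maternal
    allele and one paternal allele.

    Each genotype is a pair of allele indices, e.g. (0, 1) for a
    heterozygous call. Assumes a diploid, autosomal site (illustrative
    assumption, not taken from the paper).
    """
    child_sorted = tuple(sorted(child))
    # Try every combination of one maternal and one paternal allele.
    return any(
        tuple(sorted((m, f))) == child_sorted
        for m, f in product(mother, father)
    )


def count_violations(trio_calls):
    """Count sites whose (child, mother, father) genotypes break the rule."""
    return sum(
        1
        for child, mother, father in trio_calls
        if not is_mendelian_consistent(child, mother, father)
    )


# Hypothetical genotype calls at three sites: (child, mother, father).
example_calls = [
    ((0, 1), (0, 0), (0, 1)),  # consistent: 0 from mother, 1 from father
    ((1, 1), (0, 0), (0, 1)),  # violation: mother cannot pass on allele 1
    ((1, 2), (0, 1), (2, 2)),  # consistent: 1 from mother, 2 from father
]
n_violations = count_violations(example_calls)
```

Calling the three family members independently gives no guarantee that the resulting genotypes pass a check of this kind, which is the motivation the abstract gives for predicting the whole trio in a single model.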
Keyphrases
  • neural network
  • single molecule
  • healthcare
  • mental health
  • high resolution
  • mass spectrometry
  • big data
  • quality improvement
  • dna methylation
  • mitochondrial dna
  • deep learning
  • health information