CompoundHetVIP: Compound Heterozygous Variant Identification Pipeline.

Dustin B Miller, Stephen R Piccolo
Published in: F1000Research (2020)
A compound heterozygous (CH) variant occurs when a person inherits two alternate alleles, one from each parent, and these alleles occur at different positions within the same gene. Therefore, CH variant identification requires distinguishing maternally from paternally derived nucleotides, a process that requires numerous computational tools. Using such tools can be difficult and often introduces unforeseen obstacles, such as installation procedures that are operating-system specific, software dependencies, and format requirements for input files. To overcome these challenges, we developed the Compound Heterozygous Variant Identification Pipeline (CompoundHetVIP), which uses a single Docker image to encapsulate commonly used software tools for phasing, annotating, and analyzing CH, homozygous alternate, and de novo variants in a series of 13 steps. To begin using our tool, researchers need only install the Docker engine and download the CompoundHetVIP Docker image. The tools provided in CompoundHetVIP can be applied to Illumina whole-genome sequencing data of individual samples or trios (a child and both parents), using VCF or gVCF files as initial input. Each step of the pipeline produces an analysis-ready output file that can be further evaluated. To illustrate its use, we applied CompoundHetVIP to data from a publicly available Ashkenazim trio and, after filtering, identified two genes with candidate CH variants and one gene with a candidate homozygous alternate variant. While this example uses genomic data from a healthy child, we anticipate that most researchers will use CompoundHetVIP to uncover missing heritability in human diseases and other phenotypes. CompoundHetVIP is open-source software and can be found at https://github.com/dmiller903/CompoundHetVIP; this repository also provides detailed, step-by-step examples.
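The CH definition above can be illustrated with a minimal Python sketch. It is not CompoundHetVIP's implementation. It assumes variants have already been phased and annotated with a gene, and that genotypes use the VCF convention in which `|` separates the two haplotypes. The names `PhasedVariant`, `find_compound_het_genes`, and the `GENE_*` labels are hypothetical and exist only for this example.

```python
from collections import defaultdict
from typing import NamedTuple


class PhasedVariant(NamedTuple):
    """Hypothetical record for one annotated variant call in a single sample."""
    gene: str
    position: int
    genotype: str  # VCF-style genotype, e.g. "0|1" (phased) or "0/1" (unphased)


def find_compound_het_genes(variants):
    """Return {gene: (positions_on_first_haplotype, positions_on_second_haplotype)}
    for genes that carry a heterozygous alternate allele on each haplotype."""
    by_gene = defaultdict(lambda: (set(), set()))
    for v in variants:
        if "|" not in v.genotype:
            continue  # unphased call: parental origin cannot be assigned
        first, second = v.genotype.split("|")
        if first == "1" and second == "0":
            by_gene[v.gene][0].add(v.position)
        elif first == "0" and second == "1":
            by_gene[v.gene][1].add(v.position)
        # "1|1" (homozygous alternate) and "0|0" are not heterozygous, so skip them
    return {
        gene: (sorted(hap_a), sorted(hap_b))
        for gene, (hap_a, hap_b) in by_gene.items()
        if hap_a and hap_b  # one alternate allele from each parent
    }


# Made-up example data (not from the Ashkenazim trio)
example_variants = [
    PhasedVariant("GENE_A", 100, "1|0"),
    PhasedVariant("GENE_A", 250, "0|1"),  # opposite haplotype: CH candidate
    PhasedVariant("GENE_B", 400, "1|0"),
    PhasedVariant("GENE_B", 480, "1|0"),  # same haplotype: not CH
    PhasedVariant("GENE_C", 900, "1|1"),  # homozygous alternate
    PhasedVariant("GENE_D", 50, "0/1"),   # unphased
]

print(find_compound_het_genes(example_variants))
```

Only `GENE_A` qualifies here, because it is the only gene with alternate alleles on both haplotypes at different positions. A real analysis would add the filtering described in the abstract (for example, on annotation and allele frequency), which this sketch leaves out.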