Using Mendelian inheritance to improve high-throughput SNP discovery.
Nancy Chen, Cristopher V Van Hout, Srikanth Gottipati, Andrew G Clark
Published in: Genetics (2014)
Restriction site-associated DNA sequencing or genotyping-by-sequencing (GBS) approaches allow for rapid and cost-effective discovery and genotyping of thousands of single-nucleotide polymorphisms (SNPs) in multiple individuals. However, rigorous quality control practices are needed to avoid high levels of error and bias with these reduced representation methods. We developed a formal statistical framework for filtering spurious loci, using Mendelian inheritance patterns in nuclear families, that accommodates variable-quality genotype calls and missing data (both rampant issues with GBS data) and for identifying sex-linked SNPs. Simulations predict excellent performance of both the Mendelian filter and the sex-linkage assignment under a variety of conditions. We further evaluate our method by applying it to real GBS data and validating a subset of high-quality SNPs. These results demonstrate that our metric of Mendelian inheritance is a powerful quality filter for GBS loci that is complementary to standard coverage and Hardy-Weinberg filters. The described method, implemented in the software MendelChecker, will improve quality control during SNP discovery in nonmodel as well as model organisms.
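To make the idea of a Mendelian inheritance filter concrete, the following is a minimal sketch for a biallelic autosomal SNP in father-mother-child trios. It is not the statistic implemented in MendelChecker, whose details are not given in this abstract. The function names, the `MISSING` placeholder, and the scoring rule (the probability that a trio's genotypes are Mendelian-consistent, averaged over families) are all assumptions made for illustration. Each individual is represented by a tuple of probabilities over alternate-allele counts 0, 1, and 2, which lets the sketch handle uncertain calls, and a missing genotype is treated as uniform.

```python
from itertools import product

# Hypothetical placeholder for a missing genotype: no information, so uniform.
MISSING = (1 / 3, 1 / 3, 1 / 3)


def child_prob(c, f, m):
    """P(child has c alt alleles | father has f, mother has m), autosomal SNP."""
    pf, pm = f / 2.0, m / 2.0  # chance each parent transmits the alt allele
    if c == 0:
        return (1 - pf) * (1 - pm)
    if c == 1:
        return pf * (1 - pm) + (1 - pf) * pm
    return pf * pm


def trio_consistency(father, mother, child):
    """Probability that the trio's true genotypes obey Mendelian inheritance.

    Each argument is a tuple of three probabilities over alt-allele counts
    0, 1, 2 (for example, normalized genotype likelihoods).
    """
    total = 0.0
    for f, m, c in product(range(3), repeat=3):
        if child_prob(c, f, m) > 0:  # genotype combination is transmissible
            total += father[f] * mother[m] * child[c]
    return total


def locus_score(trios):
    """Average consistency across trios; low values flag a suspect locus."""
    return sum(trio_consistency(*t) for t in trios) / len(trios)


if __name__ == "__main__":
    hom_ref, het, hom_alt = (1, 0, 0), (0, 1, 0), (0, 0, 1)
    trios = [
        (hom_ref, hom_alt, het),      # consistent: child must be heterozygous
        (hom_ref, hom_ref, het),      # Mendelian error
        (hom_ref, hom_ref, MISSING),  # child genotype missing
    ]
    print(locus_score(trios))
```

In this sketch a locus would be dropped when its score falls below a user-chosen threshold. A fuller treatment would also weight consistent configurations by their transmission probabilities and population genotype frequencies, and would handle sex-linked loci separately.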
Keyphrases
- genome wide
- quality control
- high throughput
- dna methylation
- copy number
- mitochondrial dna
- small molecule
- single cell
- electronic health record
- big data
- primary care
- healthcare
- genome wide association
- gene expression
- data analysis
- molecular dynamics
- circulating tumor
- multidrug resistant
- single molecule
- quality improvement
- machine learning
- human immunodeficiency virus
- quantum dots