polyRAD: Genotype Calling with Uncertainty from Sequencing Data in Polyploids and Diploids.
Lindsay V Clark, Alexander E Lipka, Erik J Sacks
Published in: G3 (Bethesda, Md.) (2019)
Low or uneven read depth is a common limitation of genotyping-by-sequencing (GBS) and restriction site-associated DNA sequencing (RAD-seq), resulting in high missing data rates, heterozygotes miscalled as homozygotes, and uncertainty of allele copy number in heterozygous polyploids. Bayesian genotype calling can mitigate these issues, but previously has only been implemented in software that requires a reference genome or uses priors that may be inappropriate for the population. Here we present several novel Bayesian algorithms that estimate genotype posterior probabilities, all of which are implemented in a new R package, polyRAD. Appropriate priors can be specified for mapping populations, populations in Hardy-Weinberg equilibrium, or structured populations, and in each case can be informed by genotypes at linked markers. The polyRAD software imports read depth from several existing pipelines, and outputs continuous or discrete numerical genotypes suitable for analyses such as genome-wide association and genomic prediction.
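To make the idea of Bayesian genotype calling concrete, the following is a minimal Python sketch, not polyRAD's actual algorithm or API (polyRAD is an R package). It assumes a single biallelic marker, a Hardy-Weinberg (binomial) prior on allele copy number, and a simple binomial read-sampling model with a fixed sequencing error rate. The function names `genotype_posteriors` and `summarize`, and the parameters `alt_freq` and `error`, are hypothetical and chosen only for illustration.

```python
from math import comb


def genotype_posteriors(alt_reads, total_reads, alt_freq, ploidy=4, error=0.001):
    """Posterior probability of each allele copy number (0..ploidy).

    Assumptions (illustrative only): a Hardy-Weinberg binomial prior
    and binomial sampling of reads with a symmetric error rate.
    """
    unnormalized = []
    for copies in range(ploidy + 1):
        # Prior: binomial draw of `copies` alternative alleles under HWE.
        prior = (comb(ploidy, copies)
                 * alt_freq ** copies
                 * (1 - alt_freq) ** (ploidy - copies))
        # Expected fraction of alternative reads, allowing for errors.
        frac = copies / ploidy
        p_alt = frac * (1 - error) + (1 - frac) * error
        # Likelihood of the observed read counts given this genotype.
        likelihood = (comb(total_reads, alt_reads)
                      * p_alt ** alt_reads
                      * (1 - p_alt) ** (total_reads - alt_reads))
        unnormalized.append(prior * likelihood)
    total = sum(unnormalized)
    return [value / total for value in unnormalized]


def summarize(posteriors):
    """Return (discrete, continuous) numerical genotypes."""
    # Discrete call: the most probable allele copy number.
    discrete = max(range(len(posteriors)), key=posteriors.__getitem__)
    # Continuous call: the posterior mean allele copy number.
    continuous = sum(copies * p for copies, p in enumerate(posteriors))
    return discrete, continuous


# A tetraploid with only two reads, one of each allele: the posterior
# stays spread across the heterozygous classes instead of forcing a call.
posteriors = genotype_posteriors(alt_reads=1, total_reads=2, alt_freq=0.5)
discrete_call, continuous_call = summarize(posteriors)
```

With no reads at all, this sketch simply returns the prior, which is one way to see how a population-appropriate prior can stand in for missing data. The posterior mean is one simple form a continuous numerical genotype can take when a discrete call would overstate certainty.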
Keyphrases
- copy number
- genome wide
- single cell
- mitochondrial dna
- single molecule
- genome wide association
- dna methylation
- genetic diversity
- electronic health record
- data analysis
- rna seq
- optical coherence tomography
- big data
- machine learning
- high throughput
- high resolution
- early onset
- molecular dynamics
- dna repair
- deep learning
- cell free
- gene expression
- circulating tumor
- molecular dynamics simulations
- oxidative stress
- mass spectrometry
- circulating tumor cells