
Phasing and imputation of single nucleotide polymorphism data of missing parents of biparental plant populations.

Serap Gonen, Valentin Wimmer, R Chris Gaynor, Ed Byrne, Gregor Gorjanc, John M Hickey
Published in: Crop science (2021)
This paper presents an extension to a heuristic method for phasing and imputation of genotypes of descendants in biparental populations so that it can also phase and impute genotypes of parents that are ungenotyped or partially genotyped. The imputed genotypes of the parent are then used to impute low-density (Ld) genotyped descendants to high density (Hd). The extension was implemented as part of the AlphaPlantImpute software and works in three steps. First, it identifies whether a parent has no genotypes or only Ld genotypes and identifies its relatives that have Hd genotypes. Second, using the Hd genotypes of relatives, it determines whether the parent is homozygous or heterozygous at a given locus. Third, it phases heterozygous positions of the parent by matching haplotypes to its relatives. We measured the accuracy (correlation between true and imputed genotypes) of imputing parent genotypes in simulated biparental populations from different scenarios. We also measured the imputation accuracy of the missing parent's descendants when the parent's true genotypes were used and compared it with the accuracy obtained when the parent's imputed genotypes were used. Across all scenarios, the imputation accuracy of a parent was generally >0.98 and never dropped below ∼0.96. The imputation accuracy of a parent was always higher when it was inbred than when it was outbred. Including ancestors of the parent genotyped at Hd, increasing the number of crosses, and increasing the number of Hd descendants all increased the imputation accuracy. The high imputation accuracy achieved for the parent translated to little or no impact on the imputation accuracy of its descendants.
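The accuracy measure used in the abstract, the correlation between true and imputed genotypes, can be illustrated with a minimal sketch. It assumes genotypes are coded as allele dosages (0, 1, 2) and that the correlation is Pearson's; the function name `imputation_accuracy` and the example vectors are hypothetical and are not taken from AlphaPlantImpute or from the paper's data.

```python
from math import sqrt


def imputation_accuracy(true_genotypes, imputed_genotypes):
    """Pearson correlation between true and imputed genotype dosages.

    Assumption: each genotype is coded as an allele dosage (0, 1 or 2).
    """
    if len(true_genotypes) != len(imputed_genotypes):
        raise ValueError("genotype vectors must have the same length")
    n = len(true_genotypes)
    if n < 2:
        raise ValueError("at least two loci are needed")

    mean_true = sum(true_genotypes) / n
    mean_imputed = sum(imputed_genotypes) / n

    # Sums of cross-products and squared deviations from the means.
    cov = sum((t - mean_true) * (i - mean_imputed)
              for t, i in zip(true_genotypes, imputed_genotypes))
    var_true = sum((t - mean_true) ** 2 for t in true_genotypes)
    var_imputed = sum((i - mean_imputed) ** 2 for i in imputed_genotypes)

    if var_true == 0 or var_imputed == 0:
        raise ValueError("correlation is undefined without genotype variation")
    return cov / sqrt(var_true * var_imputed)


# Hypothetical parent genotypes at eight loci; one locus is imputed wrongly.
true_parent = [0, 2, 1, 2, 0, 1, 2, 0]
imputed_parent = [0, 2, 1, 2, 0, 2, 2, 0]
accuracy = imputation_accuracy(true_parent, imputed_parent)
print(round(accuracy, 3))
```

A correlation, unlike a simple proportion of matching genotypes, also penalizes an error by how far the imputed dosage lies from the true one, which is one plausible reason for reporting accuracy this way.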
Keyphrases
  • climate change
  • gene expression
  • genome wide
  • early onset
  • machine learning
  • big data
  • deep learning
  • electronic health record