Accurate genotype imputation from low-coverage whole-genome sequencing data of rainbow trout.

Sixin Liu, Kyle E Martin, Warren M Snelling, Roseanna Long, Timothy D Leeds, Roger L Vallejo, Gregory D Wiens, Yniv Palti
Published in: G3 (Bethesda, Md.) (2024)
With the rapid and significant cost reduction of next-generation sequencing, low-coverage whole-genome sequencing (lcWGS) followed by genotype imputation is becoming a cost-effective alternative to SNP (single nucleotide polymorphism) array genotyping. The objectives of this study were two-fold: 1) construct a haplotype reference panel for genotype imputation from lcWGS data in rainbow trout (Oncorhynchus mykiss); and 2) evaluate the concordance between imputed genotypes and SNP-array genotypes in two breeding populations. Medium-coverage (12x) whole-genome sequences were obtained from a total of 410 fish representing five breeding populations with various spawning dates. The short-read sequences were mapped to the rainbow trout reference genome, and genetic variants were identified using GATK. After data filtering, 20,434,612 biallelic SNPs were retained. The reference panel was phased with SHAPEIT5, and was used as a reference to impute genotypes from lcWGS data using GLIMPSE2. A total of 90 fish from the Troutlodge November breeding population were sequenced with an average coverage of 1.3x, and these fish were also genotyped with the Axiom 57K rainbow trout SNP array. The concordance between array-based genotypes and imputed genotypes was 99.1%. After downsampling the coverage to 0.5x, 0.2x and 0.1x, the concordance between array-based genotypes and imputed genotypes was 98.7%, 97.8% and 96.7%, respectively. In the USDA odd-year breeding population, the concordance between array-based genotypes and imputed genotypes was 97.8% for 109 fish downsampled to 0.5x coverage. Therefore, the reference haplotype panel reported in this study can be used to accurately impute genotypes from lcWGS data in rainbow trout breeding populations.
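The abstract does not say exactly how concordance was computed, so the following is only a minimal sketch under a common assumption: concordance is the fraction of genotype calls that agree between the SNP array and the imputed data, counted over sites where both calls are present. The function name, the 0/1/2 alternate-allele-count encoding, the use of `None` for a missing call, and the toy genotypes are all illustrative choices, not details taken from the study.

```python
from typing import Optional, Sequence


def genotype_concordance(
    array_calls: Sequence[Optional[int]],
    imputed_calls: Sequence[Optional[int]],
) -> float:
    """Fraction of matching genotype calls over sites called in both inputs.

    Genotypes are assumed to be encoded as alternate-allele counts (0, 1, 2),
    with None marking a missing call. This encoding is an assumption made
    for the example, not something stated in the abstract.
    """
    if len(array_calls) != len(imputed_calls):
        raise ValueError("both inputs must cover the same SNPs in the same order")

    compared = 0
    matches = 0
    for array_gt, imputed_gt in zip(array_calls, imputed_calls):
        # Skip sites where either source has no call.
        if array_gt is None or imputed_gt is None:
            continue
        compared += 1
        if array_gt == imputed_gt:
            matches += 1

    if compared == 0:
        raise ValueError("no sites with calls in both inputs")
    return matches / compared


# Toy data for one fish at eight SNPs (made up for illustration).
array_calls = [0, 1, 2, 1, None, 0, 2, 1]
imputed_calls = [0, 1, 2, 0, 1, 0, 2, 1]

# Seven sites are comparable and six agree.
print(f"{genotype_concordance(array_calls, imputed_calls):.3f}")  # prints 0.857
```

In practice the same comparison would be pooled or averaged across all fish and all array SNPs that overlap the imputed set; how the study aggregated across samples is not specified in the abstract.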