Using genotyping-by-sequencing to predict gender in animals.
Timothy P Bilton, A J Chappell, S M Clarke, R Brauning, K G Dodds, J C McEwan, S J Rowe
Published in: Animal genetics (2019)
Gender assignment errors are common in some animal species and lead to inaccuracies in downstream analyses. Procedures for detecting gender misassignment are available for array-based SNP data but are still being developed for genotyping-by-sequencing (GBS) data. In this study, we describe a method for using GBS data to predict gender using X and Y chromosomal SNPs. From a set of 1286 X chromosomal and 23 Y chromosomal deer (Cervus sp.) SNPs discovered from GBS sequence reads, a prediction model was built using a training dataset of 422 Red deer and validated using a test dataset of 868 Red deer and Wapiti deer. Prediction was based on the proportion of heterozygous genotypes on the X chromosome and the proportion of non-missing genotypes on the Y chromosome observed in each individual. The concordance between recorded gender and predicted gender was 98.6% in the training dataset and 99.3% in the test dataset. The model identified five individuals across both datasets with incorrect recorded gender and was unable to predict gender for another five individuals. Overall, our method predicted gender with a high degree of accuracy and could be used for quality control in gender assignment datasets or for assigning gender when unrecorded, provided a suitable reference genome is available.
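The two per-individual statistics named in the abstract, the proportion of heterozygous genotypes on the X chromosome and the proportion of non-missing genotypes on the Y chromosome, can be illustrated with a minimal sketch. This is not the authors' fitted model. The genotype coding (0/1/2 allele counts, `None` for missing), the function names, and both cutoff values are assumptions chosen for illustration only. The rule relies on mammalian males carrying a single X (so few heterozygous X calls) and a Y (so Y SNPs are called), while females show X heterozygosity and mostly missing Y calls.

```python
from typing import Optional, Sequence

Genotype = Optional[int]  # assumed coding: 0, 1 or 2 alternate alleles; None = missing


def x_het_proportion(x_genotypes: Sequence[Genotype]) -> Optional[float]:
    """Proportion of heterozygous calls among non-missing X chromosomal genotypes."""
    called = [g for g in x_genotypes if g is not None]
    if not called:
        return None  # no usable X data for this individual
    return sum(1 for g in called if g == 1) / len(called)


def y_call_proportion(y_genotypes: Sequence[Genotype]) -> Optional[float]:
    """Proportion of Y chromosomal SNPs with a non-missing genotype."""
    if not y_genotypes:
        return None
    return sum(1 for g in y_genotypes if g is not None) / len(y_genotypes)


def predict_gender(
    x_genotypes: Sequence[Genotype],
    y_genotypes: Sequence[Genotype],
    x_het_cutoff: float = 0.1,   # hypothetical cutoff, not taken from the paper
    y_call_cutoff: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> str:
    """Return 'male', 'female' or 'unknown' from a simple two-threshold rule."""
    x_het = x_het_proportion(x_genotypes)
    y_call = y_call_proportion(y_genotypes)
    if x_het is None or y_call is None:
        return "unknown"
    if x_het < x_het_cutoff and y_call >= y_call_cutoff:
        return "male"
    if x_het >= x_het_cutoff and y_call < y_call_cutoff:
        return "female"
    # The two statistics disagree, so no prediction is made.
    return "unknown"
```

Returning "unknown" when the two statistics conflict mirrors the abstract's report that gender could not be predicted for some individuals. In practice the cutoffs would need to be estimated from a training dataset with recorded gender, not fixed in advance.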