The design and application of a 50 K SNP chip for a threatened Aotearoa New Zealand passerine, the hihi.
Kate D Lee, Craig D Millar, Patricia Brekke, Annabel Whibley, John G Ewen, Melanie Hingston, Amy Zhu, Anna W Santure
Published in: Molecular Ecology Resources (2021)
Next-generation sequencing has transformed the fields of ecological and evolutionary genetics by allowing for cost-effective identification of genome-wide variation. Single nucleotide polymorphism (SNP) arrays, or "SNP chips", enable very large numbers of individuals to be consistently genotyped at a selected set of these identified markers, and also offer the advantage of being able to analyse samples of variable DNA quality. We used reduced representation restriction-aided digest sequencing (RAD-seq) of 31 birds of the threatened hihi (Notiomystis cincta; stitchbird) and low-coverage whole genome sequencing (WGS) of 10 of these birds to develop an Affymetrix 50 K SNP chip. We overcame the limitations of having no hihi reference genome and a low quantity of sequence data by separate and pooled de novo assembly of each of the 10 WGS birds. Reads from all individuals were mapped back to these de novo assemblies to identify SNPs. A subset of RAD-seq and WGS SNPs were selected for inclusion on the chip, prioritising SNPs with the highest quality scores whose flanking sequence uniquely aligned to the zebra finch (Taeniopygia guttata) genome. Of the 58,466 SNPs manufactured on the chip, 72% passed filtering metrics and were polymorphic. By genotyping 1,536 hihi on the array, we found that SNPs detected in multiple assemblies were more likely to successfully genotype, representing a cost-effective approach to identify SNPs for genotyping. Here, we demonstrate the utility of the SNP chip by describing the high rates of linkage disequilibrium in the hihi genome, reflecting the history of population bottlenecks in the species.
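The SNP selection step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `CandidateSNP` record, its field names and the `select_snps` function are hypothetical, and the use of the number of supporting assemblies as a tie-breaker is an assumption prompted by the finding that SNPs detected in multiple assemblies genotyped more successfully. The sketch keeps only candidates whose flanking sequence aligns uniquely to the zebra finch genome, then ranks them by quality score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSNP:
    """Hypothetical record for one candidate SNP (field names are assumptions)."""
    snp_id: str
    quality: float                 # variant-caller quality score
    n_assemblies: int              # number of de novo assemblies in which the SNP was called
    unique_zebra_finch_hit: bool   # flanking sequence aligns uniquely to zebra finch


def select_snps(candidates, n_slots):
    """Return up to n_slots candidates for the chip.

    Candidates without a unique zebra finch alignment are dropped.
    The rest are ranked by quality score (highest first); ties are
    broken by the number of assemblies supporting the SNP, which is
    an assumption of this sketch rather than a documented rule.
    """
    eligible = [c for c in candidates if c.unique_zebra_finch_hit]
    ranked = sorted(
        eligible,
        key=lambda c: (c.quality, c.n_assemblies),
        reverse=True,
    )
    return ranked[:n_slots]


# Toy data for illustration only; these are not values from the study.
candidates = [
    CandidateSNP("snp_a", 50.0, 3, True),
    CandidateSNP("snp_b", 99.0, 1, False),  # dropped: no unique alignment
    CandidateSNP("snp_c", 50.0, 7, True),
    CandidateSNP("snp_d", 80.0, 2, True),
]

chosen = select_snps(candidates, n_slots=2)
print([c.snp_id for c in chosen])  # ['snp_d', 'snp_c']
```

In this toy run `snp_b` is excluded despite its high quality score because its flanking sequence does not align uniquely, and `snp_c` outranks `snp_a` only through the assumed tie-breaker.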