Utility of pooled sequencing for association mapping in nonmodel organisms.
Steven J Micheletti, Shawn R Narum. Published in: Molecular Ecology Resources (2018)
High-density genome-wide sequencing increases the likelihood of discovering genes of major effect and genomic structural variation in organisms. While reference genomes are increasingly available across broad taxa, the greatest limitation to whole-genome sequencing of multiple individuals continues to be the cost of sequencing. To reduce that cost, multiple individuals with similar phenotypes can be pooled and their homogenized DNA sequenced (Pool-Seq), which achieves high genome coverage at the expense of individual genotypes. Although Pool-Seq has been an effective method for association mapping in model organisms, it has not been frequently used in natural populations. To extend bioinformatic tools for rapid implementation of Pool-Seq data in nonmodel organisms, we developed a pipeline called PoolParty and illustrate its effectiveness in genetic association mapping. Alignment expectations based on five pooled Chinook salmon (Oncorhynchus tshawytscha) libraries showed that approximately 48% genome coverage per library could be achieved with reasonable sequencing effort. We additionally examined male and female O. tshawytscha libraries to illustrate how Pool-Seq techniques can successfully map known genes associated with functional differences between the sexes, such as growth hormone 2. Finally, we compared pools of individuals of different spawning ages for each sex to discover novel genes involved in age at maturity in O. tshawytscha, such as opsin4 and transmembrane protein 19. While not appropriate for every system, Pool-Seq combined with the PoolParty pipeline is a practical method for identifying genes of major effect in nonmodel organisms when high genome coverage is necessary and cost is a limiting factor.
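The idea behind association mapping with pooled data can be illustrated with a minimal sketch. Because individual genotypes are lost, each pool is summarized at a variant site by its reference and alternate read counts, and two phenotype pools are compared by their allele frequencies. The example below is an assumption-laden illustration, not code from PoolParty: the function names and the read counts are hypothetical, and the two-sided Fisher's exact test is used here only as one common way to compare allele counts. It also treats every read as an independent allele draw, which ignores pool size and sequencing noise.

```python
from math import comb


def allele_frequency(alt_reads, total_reads):
    """Alternate-allele frequency estimated from pooled read counts."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test on the 2x2 table [[a, b], [c, d]].

    Rows are the two pools; columns are alternate and reference read counts.
    The p-value sums the probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x alternate reads in the first pool.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(p_value, 1.0)


# Hypothetical read counts at one site for two phenotype pools.
pool_a = {"alt": 45, "ref": 5}
pool_b = {"alt": 10, "ref": 40}

freq_a = allele_frequency(pool_a["alt"], pool_a["alt"] + pool_a["ref"])
freq_b = allele_frequency(pool_b["alt"], pool_b["alt"] + pool_b["ref"])
p = fisher_exact_two_sided(pool_a["alt"], pool_a["ref"],
                           pool_b["alt"], pool_b["ref"])

print(f"frequency difference: {freq_a - freq_b:.2f}")
print(f"Fisher's exact p-value: {p:.3g}")
```

In a genome-wide scan this comparison would be repeated at every variant site, so the resulting p-values would need correction for multiple testing before any site is treated as a candidate.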
Keyphrases
- genome wide
- high density
- single cell
- dna methylation
- copy number
- gram negative
- rna seq
- high resolution
- growth hormone
- affordable care act
- randomized controlled trial
- primary care
- healthcare
- electronic health record
- clinical trial
- gene expression
- body mass index
- big data
- quality improvement
- deep learning
- transcription factor
- protein protein
- study protocol
- data analysis
- binding protein
- sensitive detection
- cell free
- small molecule