CHIIMP: An automated high-throughput microsatellite genotyping platform reveals greater allelic diversity in wild chimpanzees.
Hannah J Barbian, Andrew Jesse Connell, Alexa N Avitto, Ronnie M Russell, Andrew G Smith, Madhurima S Gundlapally, Alexander L Shazad, Yingying Li, Frederic Bibollet-Ruche, Emily E Wroblewski, Deus Mjungu, Elizabeth V Lonsdorf, Fiona A Stewart, Alexander K Piel, Anne E Pusey, Paul M Sharp, Beatrice H Hahn
Published in: Ecology and evolution (2018)
Short tandem repeats (STRs), also known as microsatellites, are commonly used to noninvasively genotype wild-living endangered species, including African apes. Until recently, capillary electrophoresis has been the method of choice to determine the length of polymorphic STR loci. However, this technique is labor intensive, difficult to compare across platforms, and notoriously imprecise. Here we developed a MiSeq-based approach and tested its performance using previously genotyped fecal samples from long-term studied chimpanzees in Gombe National Park, Tanzania. Using data from eight microsatellite loci as a reference, we designed a bioinformatics platform that converts raw MiSeq reads into locus-specific files and automatically calls alleles after filtering stutter sequences and other PCR artifacts. Applying this method to the entire Gombe population, we confirmed previously reported genotypes, but also identified 31 new alleles that had been missed due to sequence differences and size homoplasy. The new genotypes, which increased the allelic diversity and heterozygosity in Gombe by 61% and 8%, respectively, were validated by replicate amplification and pedigree analyses. This demonstrated inheritance and resolved one case of an ambiguous paternity. Using both singleplex and multiplex locus amplification, we also genotyped fecal samples from chimpanzees in the Greater Mahale Ecosystem in Tanzania, demonstrating the utility of the MiSeq-based approach for genotyping nonhabituated populations and performing comparative analyses across field sites. The new automated high-throughput analysis platform (available at https://github.com/ShawHahnLab/chiimp) will allow biologists to more accurately and effectively determine wildlife population size and structure, and thus obtain information critical for conservation efforts.
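To make the allele-calling step more concrete, the following is a minimal Python sketch of sequence-based calling with stutter filtering for a single locus in a single sample. It is not the CHIIMP implementation: the function name `call_alleles`, the parameters `min_fraction` and `stutter_ratio`, and their default values are all assumptions chosen for illustration. Candidates are distinct read sequences rather than fragment lengths, which is what allows two alleles of identical length (size homoplasy) to be told apart.

```python
from collections import Counter


def call_alleles(reads, motif_len, min_fraction=0.05, stutter_ratio=0.33):
    """Call up to two alleles from the reads of one locus in one sample.

    Hypothetical illustration only; thresholds are assumed, not taken
    from the paper.

    reads         -- list of read sequences already assigned to this locus
    motif_len     -- repeat unit length in bases (e.g. 4 for a GATA repeat)
    min_fraction  -- minimum share of all reads a candidate must reach
    stutter_ratio -- a candidate exactly one repeat unit shorter than an
                     accepted allele is treated as stutter if its count is
                     at most this fraction of that allele's count
    """
    counts = Counter(reads)
    total = sum(counts.values())
    if total == 0:
        return []

    alleles = []  # accepted (sequence, count) pairs, most abundant first
    for seq, n in counts.most_common():
        # Drop low-abundance sequences (sequencing errors, other artifacts).
        if n / total < min_fraction:
            continue
        # Stutter: one repeat unit shorter than an accepted allele and
        # much less abundant than it.
        is_stutter = any(
            len(a_seq) - len(seq) == motif_len and n <= stutter_ratio * a_n
            for a_seq, a_n in alleles
        )
        if not is_stutter:
            alleles.append((seq, n))
        if len(alleles) == 2:  # diploid: report at most two alleles
            break
    return [seq for seq, _ in alleles]


# Homozygous sample: the shorter, less abundant sequence is filtered as stutter.
allele = "GATA" * 10
stutter = "GATA" * 9
print(call_alleles([allele] * 80 + [stutter] * 20, motif_len=4))

# Size homoplasy: two 40-base alleles that differ only in sequence.
variant = "GATA" * 5 + "GACA" + "GATA" * 4
print(len(call_alleles([allele] * 50 + [variant] * 50, motif_len=4)))
```

A length-only readout would report the second sample as homozygous for a 40-base allele, whereas comparing sequences reports two alleles.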
Keyphrases
- high throughput
- genetic diversity
- genome wide association study
- capillary electrophoresis
- genome wide
- single cell
- nucleic acid
- quality improvement
- mass spectrometry
- climate change
- machine learning
- mitochondrial dna
- gene expression
- dna methylation
- healthcare
- magnetic resonance
- big data
- risk assessment
- health information
- computed tomography
- deep learning
- magnetic resonance imaging
- contrast enhanced