parallelnewhybrid: an R package for the parallelization of hybrid detection using newhybrids.

Brendan F Wringe, Ryan R E Stanley, Nicholas W Jeffery, Eric C Anderson, Ian R Bradbury
Published in: Molecular Ecology Resources (2016)
Hybridization among populations and species is a central theme in many areas of biology, and the study of hybridization has direct applicability to testing hypotheses about evolution, speciation and genetic recombination, as well as having conservation, legal and regulatory implications. Yet, despite being a topic of considerable interest, the identification of hybrid individuals, and quantification of the (un)certainty surrounding the identifications, remains difficult. Unlike other programs that exist to identify hybrids based on genotypic information, newhybrids is able to assign individuals to specific hybrid classes (e.g. F1, F2) because it makes use of patterns of gene inheritance within each locus, rather than just the proportions of gene inheritance within each individual. For each comparison and set of markers, multiple independent runs of each data set should be used to develop an estimate of the hybrid class assignment accuracy. The necessity of analysing multiple simulated data sets, constructed from large genomewide data sets, presents significant computational challenges. To address these challenges, we present parallelnewhybrid, an R package designed to decrease user burden when undertaking multiple newhybrids analyses. parallelnewhybrid does so by taking advantage of the parallel computational capabilities inherent in modern computers to efficiently and automatically execute separate newhybrids runs in parallel. We show that parallelization of analyses using this package affords users several-fold reductions in time over a traditional serial analysis. parallelnewhybrid consists of an example data set, a readme and three operating system-specific functions to execute parallel newhybrids analyses on each of a computer's c cores. parallelnewhybrid is freely available on the long-term software hosting site GitHub (www.github.com/bwringe/parallelnewhybrid).
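The general pattern described in the abstract, running several independent analyses at once with one worker per core, can be sketched as follows. This is an illustrative Python sketch only, not the package's R code or its interface: `run_one_analysis` and `run_all_parallel` are hypothetical names, the file names are made up, and the worker body is a dummy computation standing in for a single newhybrids run. A thread pool is used on the assumption that each worker would mostly wait on an external program.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_one_analysis(dataset_name):
    """Hypothetical stand-in for one independent analysis run.

    A real worker would launch the external program on this data set and
    wait for it to finish. Here a dummy value is computed instead.
    """
    dummy_result = sum(ord(ch) for ch in dataset_name)
    return dataset_name, dummy_result


def run_all_parallel(dataset_names, max_workers=None):
    """Run every data set through run_one_analysis, several at a time.

    The number of simultaneous workers defaults to the number of cores
    reported by the operating system (falling back to 1).
    """
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() preserves the input order of the data sets in its results.
        return list(pool.map(run_one_analysis, dataset_names))


if __name__ == "__main__":
    # Hypothetical names for six simulated data sets.
    datasets = ["sim_{}.txt".format(i) for i in range(1, 7)]
    for name, result in run_all_parallel(datasets):
        print(name, result)
```

Because the runs are independent of one another, no coordination between workers is needed beyond collecting the results, which is what makes this kind of workload straightforward to parallelize.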
Keyphrases
  • electronic health record
  • big data
  • genome wide
  • copy number
  • data analysis
  • public health
  • single molecule
  • healthcare
  • dna damage
  • oxidative stress
  • social media
  • health information
  • genome wide association study