skelesim: an extensible, general framework for population genetic simulation in R.
Christian M ParobekFrederick I ArcherMichelle E DePrenger-LevinSean M HobanLibby LigginsAllan E StrandPublished in: Molecular ecology resources (2016)
Simulations are a key tool in molecular ecology for inference and forecasting, as well as for evaluating new methods. Due to growing computational power and a diversity of software with different capabilities, simulations are becoming increasingly powerful and useful. However, the widespread use of simulations by geneticists and ecologists is hindered by difficulties in understanding these softwares' complex capabilities, composing code and input files, a daunting bioinformatics barrier and a steep conceptual learning curve. skelesim (an R package) guides users in choosing appropriate simulations, setting parameters, calculating genetic summary statistics and organizing data output, in a reproducible pipeline within the R environment. skelesim is designed to be an extensible framework that can 'wrap' around any simulation software (inside or outside the R environment) and be extended to calculate and graph any genetic summary statistics. Currently, skelesim implements coalescent and forward-time models available in the fastsimcoal2 and rmetasim simulation engines to produce null distributions for multiple population genetic statistics and marker types, under a variety of demographic conditions. skelesim is intended to make simulations easier while still allowing full model complexity to ensure that simulations play a fundamental role in molecular ecology investigations. skelesim can also serve as a teaching tool: demonstrating the outcomes of stochastic population genetic processes; teaching general concepts of simulations; and providing an introduction to the R environment with a user-friendly graphical user interface (using shiny).