rworkflows: automating reproducible practices for the R community.

Brian M Schilder, Alan E Murphy, Nathan G Skene
Published in: Nature Communications (2024)
Despite calls to improve reproducibility in research, achieving this goal remains elusive even within computational fields. Currently, >50% of R packages are distributed exclusively through GitHub. While the trend towards sharing open-source software has been revolutionary, GitHub does not have any default built-in checks for minimal coding standards or software usability. This makes it difficult to assess the current quality of R packages, or to use them consistently over time and across platforms. While GitHub-native solutions are technically possible, they require considerable time and expertise for each developer to write, implement, and maintain. To address this, we develop rworkflows, a suite of tools for robust continuous integration and deployment (https://github.com/neurogenomics/rworkflows). rworkflows can be implemented by developers of all skill levels using a one-time R function call which has both sensible defaults and extensive options for customisation. Once implemented, any updates to the GitHub repository automatically trigger parallel workflows that install all software dependencies, run code checks, generate a dedicated documentation website, and deploy a publicly accessible containerised environment. By making the rworkflows suite free, automated, and simple to use, we aim to promote widespread adoption of reproducible practices across a continually growing R community.
Keyphrases
  • healthcare
  • electronic health record
  • primary care
  • mental health
  • health information
  • data analysis
  • machine learning
  • deep learning
  • social media
  • high throughput
  • resting state
  • advance care planning