VC@Scale: Scalable and high-performance variant calling on cluster environments.
Tanveer Ahmad, Zaid Al Ars, H Peter Hofstee
Published in: GigaScience (2022)
We show the feasibility and easy scalability of our approach, which achieves high performance and efficient resource utilization for variant-calling analysis on high-performance computing clusters using the standardized Apache Arrow data representations. All code, scripts, and configurations used to run our implementations are publicly available as open source; see https://github.com/abs-tudelft/variant-calling-at-scale.