Login / Signup

Recommendations for performance optimizations when using GATK3.8 and GATK4.

Jacob R HeldenbrandSaurabh BahetiMatthew A BockolTravis M DruckerSteven N HartMatthew E HudsonRavishankar K IyerMichael T KalmbachKatherine I KendigEric W KleeNathan R MattsonEric D WiebenMathieu WiepertDerek E WildmanLiudmila Sergeevna Mainzer
Published in: BMC bioinformatics (2019)
In time-sensitive situations, when a patient has a critical or rapidly developing condition, it is useful to minimize the time to process a single sample. In such cases we recommend using GATK3.8 by splitting the sample into chunks and computing across multiple nodes. The resultant walltime will be nnn.4 hours at the cost of $41.60 on 4 c5.18xlarge instances of Amazon Cloud. For cost-effectiveness of routine analyses or for large population studies, it is useful to maximize the number of samples processed per unit time. Thus we recommend GATK4, running multiple samples on one node. The total walltime will be ∼34.1 hours on 40 samples, with 1.18 samples processed per hour at the cost of $2.60 per sample on c5.18xlarge instance of Amazon Cloud.
Keyphrases
  • clinical practice
  • blood pressure
  • lymph node
  • case report
  • sentinel lymph node