Comparison of sequencing data processing pipelines and application to underrepresented African human populations.
Gwenna Breton, Anna C V Johansson, Per Sjödin, Carina M Schlebusch, Mattias Jakobsson. Published in: BMC Bioinformatics (2021)
We conclude that applying the GATK "Best Practices" pipeline, including its recommended reference datasets, to underrepresented populations does not reduce the number of called variants compared to alternative pipelines. We recommend aiming for coverage of > 30X if identifying most variants is important, and working with large sample sizes at the variant calling stage, including for underrepresented individuals and populations.