Benchmarking and Optimization of Methods for the Detection of Identity-By-Descent in Plasmodium falciparum.
Bing Guo, Shannon Takala-Harrison, Timothy D O'Connor
Published in: bioRxiv: the preprint server for biology (2024)
Genomic surveillance is crucial for identifying at-risk populations for targeted malaria control and elimination. Identity-by-descent (IBD) is being used in Plasmodium population genomics to estimate genetic relatedness, effective population size (Ne), population structure, and positive selection. However, a comprehensive evaluation of IBD segment detection tools is lacking for species with high rates of recombination. Here, we employ genetic simulations reflecting P. falciparum's high recombination rate and decreasing Ne to benchmark IBD callers, including probabilistic (hmmIBD, isoRelate), identity-by-state-based (hap-IBD, phased IBD) and others (Refined IBD), using genealogy-based true IBD and downstream inference of population characteristics. Our findings reveal that low marker density per genetic unit, related to high recombination rates relative to mutation rates, significantly affects the quality of detected IBD segments. Most IBD callers suffer from high false negative rates, which can be improved with parameter optimization. Optimized parameters allow for more accurate capture of selection signals and population structure, but hmmIBD is unique in providing less biased estimates of Ne. Empirical data subsampled from the MalariaGEN Pf7 database, representing different transmission settings, confirmed these patterns. We conclude that the detection of IBD in high-recombining species requires context-specific evaluation and parameter optimization and recommend that hmmIBD be used for quality-sensitive analysis, such as estimation of Ne in these species.
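
To make the notion of false negative and false positive rates for IBD segments concrete, the following is a minimal Python sketch of one way such rates could be computed for a single pair of genomes by comparing called segments against true segments. The overlap-based definitions, the centimorgan coordinates, and all function names (merge, total_length, overlap_length, error_rates) are assumptions made for illustration; the abstract does not specify the metric definitions the authors used.

```python
from typing import List, Tuple

# A segment is (start, end) in centimorgans, with start < end (assumed units).
Segment = Tuple[float, float]


def merge(segments: List[Segment]) -> List[Segment]:
    """Sort segments and merge any that overlap or touch."""
    merged: List[Segment] = []
    for start, end in sorted(segments):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def total_length(segments: List[Segment]) -> float:
    """Total genetic length covered by the segments, counting overlaps once."""
    return sum(end - start for start, end in merge(segments))


def overlap_length(a: List[Segment], b: List[Segment]) -> float:
    """Length covered by both segment sets (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = 0
    shared = 0.0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if hi > lo:
            shared += hi - lo
        # Advance whichever segment ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return shared


def error_rates(true_segs: List[Segment],
                called_segs: List[Segment]) -> Tuple[float, float]:
    """Return (false_negative_rate, false_positive_rate).

    Assumed definitions (hypothetical, not taken from the paper):
      false negative rate = fraction of true IBD length not covered by calls
      false positive rate = fraction of called IBD length not covered by truth
    """
    shared = overlap_length(true_segs, called_segs)
    true_len = total_length(true_segs)
    called_len = total_length(called_segs)
    fn = 1.0 - shared / true_len if true_len > 0 else 0.0
    fp = 1.0 - shared / called_len if called_len > 0 else 0.0
    return fn, fp


if __name__ == "__main__":
    # Toy coordinates, not real data.
    true_ibd = [(0.0, 10.0), (20.0, 30.0)]
    called_ibd = [(2.0, 8.0), (25.0, 35.0)]
    fn_rate, fp_rate = error_rates(true_ibd, called_ibd)
    print(f"false negative rate: {fn_rate:.4f}")
    print(f"false positive rate: {fp_rate:.4f}")
```

In this toy example the calls recover 11 of the 20 cM of true IBD, so under the assumed definitions a caller that truncates or misses segments shows a high false negative rate even when most of what it reports is correct.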