fastHaN: a fast and scalable program for constructing haplotype network for large-sample sequences.
Lianjiang Chi, Xiaolong Zhang, Yongbiao Xue, Hua Chen. Published in: Molecular Ecology Resources (2023)
Haplotype networks show the genealogical relationships among DNA sequences within a species, and are therefore widely applied in population genetics, molecular ecology, epidemiology and evolutionary studies. However, existing programs become computationally infeasible as the sample size increases. Here, we present fastHaN, an efficient and scalable program for constructing haplotype networks from large samples. On a data set with a haplotype length of 30 kb, the Median Joining Network (MJN) algorithm implemented in fastHaN is 3000 times faster than PopART and 70 times faster than NETWORK in single-threaded mode. Its implementation of the Templeton-Crandall-Sing (TCS) algorithm is 100 times faster than PopART and 5800 times faster than the TCS software. fastHaN also offers a multi-threaded mode that scales with the number of threads. The source code is freely available at https://github.com/ChenHuaLab/fastHaN/. A web-based version is also available at https://ngdc.cncb.ac.cn/haplotype/.
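To make the idea of a haplotype network concrete, the following is a minimal Python sketch of the preprocessing that network methods generally share: collapsing identical aligned sequences into haplotypes, computing pairwise Hamming distances, and linking haplotypes with a minimum spanning tree. It is an illustration only, not fastHaN's code and neither the MJN nor the TCS algorithm. The function names and the toy sequences are assumptions introduced here.

```python
from collections import Counter
from itertools import combinations


def collapse_haplotypes(sequences):
    """Group identical aligned sequences into haplotypes with sample counts."""
    counts = Counter(sequences)
    haplotypes = sorted(counts)
    return haplotypes, [counts[h] for h in haplotypes]


def hamming(a, b):
    """Number of differing sites between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b))


def minimum_spanning_tree(haplotypes):
    """Kruskal's algorithm over all haplotype pairs.

    Returns a list of (i, j, distance) edges. This is a simplified
    stand-in for a real network method: it yields one tree and adds
    no median vectors or alternative connections.
    """
    edges = sorted(
        (hamming(haplotypes[i], haplotypes[j]), i, j)
        for i, j in combinations(range(len(haplotypes)), 2)
    )
    parent = list(range(len(haplotypes)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for dist, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((i, j, dist))
    return tree


# Hypothetical toy data: five samples carrying three distinct haplotypes.
samples = ["ACGT", "ACGT", "ACGA", "TCGA", "ACGT"]
haplotypes, counts = collapse_haplotypes(samples)
tree = minimum_spanning_tree(haplotypes)
```

The all-pairs distance step grows quadratically with the number of haplotypes and linearly with sequence length, which gives a rough sense of why naive implementations struggle on large samples of long sequences.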