Comparison of kNN and k-means optimization methods of reference set selection for improved CNV callers performance.
Wiktor Kuśmirek, Agnieszka Szmurło, Marek Wiewiórka, Robert Nowak, Tomasz Gambin
Published in: BMC Bioinformatics (2019)
The experiments showed that appropriate selection of the reference sample set may greatly improve the CNV detection rate. In particular, we found that a smart reduction of the reference sample set size may significantly increase the algorithms' precision while having a negligible negative effect on sensitivity. We also observed that a complete CNV calling process using the k-means algorithm as the selection method has significantly better time complexity than the kNN-based solution.
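
To make the two selection strategies concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state which features or distance metric were used, so the sketch assumes each sample is summarized by a numeric coverage profile and compared by Euclidean distance. All function names are hypothetical, and the k-means variant uses a deterministic farthest-first initialization purely to keep the example reproducible.

```python
import math


def distance(a, b):
    """Euclidean distance between two coverage profiles (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_reference_set(profiles, target, k):
    """Indices of the k profiles closest to profiles[target], excluding itself."""
    others = [i for i in range(len(profiles)) if i != target]
    others.sort(key=lambda i: distance(profiles[target], profiles[i]))
    return others[:k]


def kmeans_clusters(profiles, k, iterations=20):
    """Plain Lloyd's k-means; returns one cluster label per profile."""
    # Farthest-first initialization (an assumption, chosen for determinism).
    centroids = [list(profiles[0])]
    while len(centroids) < k:
        farthest = max(
            profiles, key=lambda p: min(distance(p, c) for c in centroids)
        )
        centroids.append(list(farthest))

    labels = [0] * len(profiles)
    for _ in range(iterations):
        # Assignment step: attach each profile to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in profiles
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def kmeans_reference_set(labels, target):
    """Indices of the other samples that share the target's cluster."""
    return [i for i, lab in enumerate(labels) if lab == labels[target] and i != target]


if __name__ == "__main__":
    # Toy coverage profiles: two visibly different groups of three samples.
    profiles = [
        [1.0, 1.1, 0.9], [1.1, 1.0, 1.0], [0.9, 1.0, 1.1],
        [2.0, 2.1, 1.9], [2.1, 2.0, 2.0], [1.9, 2.0, 2.1],
    ]
    print(knn_reference_set(profiles, target=0, k=2))
    labels = kmeans_clusters(profiles, k=2)
    print(kmeans_reference_set(labels, target=3))
```

In this sketch the kNN approach yields a separate reference set for every sample, whereas k-means clusters the cohort once and every member of a cluster shares the same group. That difference is one plausible reason the k-means pipeline would run faster end to end, though the abstract itself does not spell out the cause.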