GO functional similarity clustering depends on similarity measure, clustering method, and annotation completeness.
Meng Liu, Paul D Thomas. Published in: BMC Bioinformatics (2019)
We assessed the effects of annotation completeness on the distribution of pairwise gene semantic similarity scores, and the subsequent effects on the clusters derived from these scores. Our results suggest combinations of semantic similarity measures, gene-level scoring methods, and clustering methods that perform best for functional gene clustering using annotation sets of varying completeness. Overall, our results underscore the importance of increasing the completeness of GO annotations to support computational analyses of gene function.
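The abstract does not name the specific measures evaluated, so the following is only an illustrative sketch of how a gene-level score can be built from term-level similarities and how annotation completeness shifts that score. It assumes a best-match-average gene-level scoring scheme and a Jaccard similarity over ancestor sets; the ontology terms (`T1`, `T2`, `T3`, `root`) and the function names are hypothetical and are not taken from the paper.

```python
# Toy ontology (hypothetical): each term maps to itself plus its ancestors.
ANCESTORS = {
    "T1": {"T1", "root"},
    "T2": {"T2", "T1", "root"},
    "T3": {"T3", "root"},
}


def jaccard_term_sim(a, b):
    """Term-level similarity: Jaccard index of the two ancestor sets (assumed measure)."""
    anc_a, anc_b = ANCESTORS[a], ANCESTORS[b]
    return len(anc_a & anc_b) / len(anc_a | anc_b)


def best_match_average(terms_a, terms_b, term_sim):
    """Gene-level score: mean of every term's best match in the other gene's annotations."""
    if not terms_a or not terms_b:
        return 0.0  # an unannotated gene cannot be compared
    best_a = [max(term_sim(a, b) for b in terms_b) for a in terms_a]
    best_b = [max(term_sim(a, b) for a in terms_a) for b in terms_b]
    return (sum(best_a) + sum(best_b)) / (len(best_a) + len(best_b))


# Gene A is annotated with T2. Gene B is annotated with T2 and T3 when its
# annotation set is complete, but only with T3 when the T2 annotation is missing.
complete = best_match_average({"T2"}, {"T2", "T3"}, jaccard_term_sim)
incomplete = best_match_average({"T2"}, {"T3"}, jaccard_term_sim)
print(complete, incomplete)
```

In this toy case the missing annotation lowers the pairwise score, so the two genes would be less likely to land in the same cluster under a similarity-based clustering method.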