A novel similarity score based on gene ranks to reveal genetic relationships among diseases.
Dongmei Luo, Chengdong Zhang, Liwan Fu, Yuening Zhang, Yue-Qing Hu
Published in: PeerJ (2021)
Knowledge of similarities among diseases can contribute to uncovering common genetic mechanisms. A couple of similarity measures based on ranked gene lists have been proposed in the literature. Noting that these measures may suffer from the need to determine a cutoff or from a heavy computational load, we propose SimSIP, a novel similarity score among diseases based on gene ranks. Simulation studies under various scenarios demonstrate that SimSIP performs better than existing rank-based similarity measures. Applying SimSIP to gene expression data for 18 cancer types from The Cancer Genome Atlas shows that it is superior in clarifying the genetic relationships among diseases and tends to cluster histologically or anatomically related cancers together, which is analogous to the results of pan-cancer studies. Moreover, SimSIP has a simpler form, is faster to compute, and is more robust to higher levels of noise than existing methods, and it provides a basis for future studies on genetic relationships among diseases. In addition, a measure, MAG, is developed to gauge the magnitude of association of an individual gene with diseases. Using MAG, the genes and biological processes significantly associated with colorectal cancer are detected.
Keyphrases