Single-Cell Clustering Based on Shared Nearest Neighbor and Graph Partitioning.
Xiaoshu Zhu, Jie Zhang, Yunpei Xu, Jianxin Wang, Xiaoqing Peng, Hong-Dong Li. Published in: Interdisciplinary Sciences: Computational Life Sciences (2020)
Clustering single-cell RNA sequencing (scRNA-seq) data enables the discovery of cell subtypes, which helps in understanding and analyzing disease processes. Determining edge weights is an essential component of graph-based clustering methods. Although several graph-based clustering algorithms for scRNA-seq data have been proposed, they are generally based on k-nearest neighbor (KNN) and shared nearest neighbor (SNN) approaches and do not consider the structural information of the graph. Here, to improve clustering accuracy, we present a novel method for single-cell clustering, called structural shared nearest neighbor-Louvain (SSNN-Louvain), which integrates the structural information of the graph with module detection. In SSNN-Louvain, the weight of an edge is defined from the distance between a node and its shared nearest neighbors, together with the ratio of the number of shared nearest neighbors to the number of nearest neighbors, thus integrating structural information of the graph. A modified Louvain community detection algorithm is then proposed and applied to identify modules in the graph; each community essentially represents a cell subtype. Notably, the proposed method combines the advantages of the SNN graph and community detection without requiring any parameter to be tuned other than the number of neighbors. To test the performance of SSNN-Louvain, we compare it on 16 real datasets with five existing methods: nonnegative matrix factorization, single-cell interpretation via multi-kernel learning, SNN-Cliq, Seurat and PhenoGraph. The experimental results show that our approach achieves the best average performance on these datasets.
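The abstract does not give the exact SSNN weight formula, so the following is only a minimal sketch of the general idea of weighting KNN edges by the ratio of shared nearest neighbors to nearest neighbors. The function names `knn_sets` and `snn_ratio_edges`, the Euclidean distance, and the toy data are all assumptions made for illustration; the distance-based term and the modified Louvain step described above are not reproduced here.

```python
import math

def knn_sets(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors.

    Assumption: plain Euclidean distance, computed by brute force.
    """
    neighbors = []
    for i, p in enumerate(points):
        # Sort all other points by distance to p and keep the closest k.
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors

def snn_ratio_edges(points, k):
    """Weight each KNN edge (i, j) by (number of shared neighbors) / k.

    Illustrative simplification, not the weight defined in the paper.
    """
    nbrs = knn_sets(points, k)
    edges = {}
    for i, ni in enumerate(nbrs):
        for j in ni:
            shared = len(ni & nbrs[j])
            if shared:  # keep only edges supported by at least one shared neighbor
                edges[(min(i, j), max(i, j))] = shared / k
    return edges

# Two well-separated toy groups of "cells" in a 2-D expression space.
cells = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
weights = snn_ratio_edges(cells, k=2)
for edge, weight in sorted(weights.items()):
    print(edge, weight)
```

In this toy example, edges appear only between cells of the same group, so a community detection routine run on the weighted graph would be expected to recover the two groups. As in the abstract, the number of neighbors `k` is the only parameter.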
Keyphrases
- single cell
- rna seq
- convolutional neural network
- neural network
- high throughput
- healthcare
- machine learning
- deep learning
- mental health
- body mass index
- weight loss
- health information
- loop mediated isothermal amplification
- electronic health record
- oxidative stress
- label free
- induced apoptosis
- stem cells
- social media
- cell proliferation
- artificial intelligence
- genome wide
- signaling pathway
- body weight
- cell cycle arrest
- network analysis
- endoplasmic reticulum stress
- data analysis
- sensitive detection