Self-supervised contrastive learning for integrative single cell RNA-seq data analysis.
Wenkai Han, Yuqi Cheng, Jiayang Chen, Huawen Zhong, Zhihang Hu, Siyuan Chen, Licheng Zong, Liang Hong, Ting-Fung Chan, Irwin King, Xin Gao, Yu Li
Published in: Briefings in Bioinformatics (2022)
We present CLEAR, a novel self-supervised Contrastive LEArning framework for single-cell ribonucleic acid (RNA)-sequencing data representation and downstream analysis. Compared with current methods, CLEAR overcomes the heterogeneity of the experimental data through a specifically designed representation learning task, and can therefore handle batch effects and dropout events simultaneously. It achieves superior performance on a broad range of fundamental tasks, including clustering, visualization, dropout correction, batch effect removal, and pseudo-time inference. The proposed method successfully identifies and illustrates inflammation-related mechanisms in a COVID-19 disease study of 43,695 single cells from peripheral blood mononuclear cells.
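The abstract does not spell out the training objective, so the following is only a generic sketch of self-supervised contrastive learning on expression profiles, not CLEAR's actual implementation. It assumes an InfoNCE-style loss, in which two noisy views of the same cell form a positive pair and views of other cells act as negatives. The `augment` function, the random linear "encoder" `W`, the toy Poisson count matrix, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def augment(x, rng, drop_prob=0.2, noise_sd=0.1):
    """Hypothetical augmentation: add Gaussian noise, then zero random genes
    to mimic dropout events."""
    keep = rng.random(x.shape) >= drop_prob
    noisy = x + rng.normal(0.0, noise_sd, size=x.shape)
    return noisy * keep

def info_nce(z1, z2, temperature=0.5):
    """InfoNCE loss: row i of z1 and row i of z2 are a positive pair;
    every other row of z2 serves as a negative for row i of z1."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                      # scaled cosine similarities
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

rng = np.random.default_rng(0)

# Toy data (assumption): 8 cells x 50 genes of Poisson counts, log-transformed.
cells = rng.poisson(2.0, size=(8, 50)).astype(float)
log_cells = np.log1p(cells)

# Stand-in for a trainable encoder (assumption): a fixed random projection.
W = rng.normal(size=(50, 16))

# Two independently augmented views of the same cells, embedded by the encoder.
view_a = augment(log_cells, rng) @ W
view_b = augment(log_cells, rng) @ W

loss = info_nce(view_a, view_b)
print(f"contrastive loss on toy data: {loss:.3f}")
```

In an actual training loop the encoder weights would be updated to reduce this loss, which pulls the two views of each cell together in the embedding space and pushes different cells apart. That is the general mechanism by which a contrastive objective can make representations less sensitive to noise such as dropout.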