A comparison of scRNA-seq annotation methods based on experimentally labeled immune cell subtype dataset.

Qiqing Fu, Chenyu Dong, Yunhe Liu, Xiaoqiong Xia, Gang Liu, Fan Zhong, Lei Liu
Published in: Briefings in bioinformatics (2024)
Cell-type annotation is a critical step in single-cell data analysis. With the development of numerous cell annotation methods, it is necessary to evaluate these methods to help researchers use them effectively. Reference datasets are essential for evaluation, but currently the cell labels of reference datasets mainly come from computational methods, which may carry computational biases and may not reflect the actual cell types. This study first constructed an experimentally labeled immune cell-subtype single-cell dataset from a single batch and then systematically evaluated 18 cell annotation methods. We assessed those methods under five scenarios: intra-dataset validation, immune cell-subtype validation, unsupervised clustering, inter-dataset annotation, and unknown cell-type prediction. Accuracy and the adjusted Rand index (ARI) were the evaluation metrics. The results showed that SVM, scBERT, and scDeepSort were the best-performing supervised methods. Seurat was the best-performing unsupervised clustering method, but it could not fully fit the actual cell-type distribution. Our results indicate that the experimentally labeled immune cell-subtype dataset reveals the deficiencies of unsupervised clustering methods and provides new dataset support for supervised methods.
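The two evaluation metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration only. Accuracy is taken as the fraction of cells whose predicted label matches the experimental label, and ARI is the standard adjusted Rand index computed from the contingency table of true labels versus cluster assignments.

```python
from collections import Counter
from math import comb


def accuracy(true_labels, predicted_labels):
    """Fraction of cells whose predicted label equals the reference label."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must have the same length")
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def adjusted_rand_index(true_labels, cluster_labels):
    """Adjusted Rand index between a reference labeling and a clustering."""
    if len(true_labels) != len(cluster_labels):
        raise ValueError("label lists must have the same length")
    total_pairs = comb(len(true_labels), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: the partitions trivially agree

    # Contingency counts: cells sharing both a true label and a cluster.
    pair_counts = Counter(zip(true_labels, cluster_labels))
    true_counts = Counter(true_labels)
    cluster_counts = Counter(cluster_labels)

    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_cluster = sum(comb(c, 2) for c in cluster_counts.values())

    expected = sum_true * sum_cluster / total_pairs
    max_index = 0.5 * (sum_true + sum_cluster)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. every cell in one group
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical toy labels, not data from the study.
true_types = ["T", "T", "B", "B", "NK", "NK"]
predicted_types = ["T", "T", "B", "NK", "NK", "NK"]
clusters = [0, 0, 1, 2, 2, 2]

print("accuracy:", accuracy(true_types, predicted_types))
print("ARI:", adjusted_rand_index(true_types, clusters))
```

Accuracy requires predicted labels that share a vocabulary with the reference, so it suits supervised annotation; ARI only compares how cells are grouped, so it applies to unsupervised clustering where cluster identifiers are arbitrary.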
Keyphrases
  • single cell
  • rna seq
  • machine learning
  • high throughput
  • data analysis
  • cell therapy
  • climate change
  • computed tomography
  • mesenchymal stem cells
  • type diabetes
  • adipose tissue
  • gene expression