scSwinFormer: A Transformer-Based Cell-Type Annotation Method for scRNA-Seq Data Using Smooth Gene Embedding and Global Features.
Hengyu Qin, Xiumin Shi, Han Zhou
Published in: Journal of Chemical Information and Modeling (2024)
Single-cell omics techniques have made it possible to analyze individual cells in biological samples, providing a more detailed understanding of cellular heterogeneity and biological systems. Accurate identification of cell types is critical for single-cell RNA sequencing (scRNA-seq) analysis. However, scRNA-seq data are usually high-dimensional and sparse, which makes them difficult to analyze. Existing cell-type annotation methods are either constrained in modeling scRNA-seq data or do not consider long-term dependencies among characterized genes. In this work, we developed scSwinFormer, a Transformer-based deep learning method for cell-type annotation of large-scale scRNA-seq data. The smooth gene embedding module performs sequence modeling of the scRNA-seq data, and the self-attention module then captures potential dependencies among genes. Subsequently, the Cell Token synthesizes the global information inherent in the scRNA-seq data, facilitating accurate cell-type annotation. We evaluated our model against current state-of-the-art scRNA-seq cell-type annotation methods on multiple real data sets. In both the external and the benchmark data set experiments, scSwinFormer outperformed these methods.
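The three-stage pipeline described above (gene embedding, self-attention, Cell Token readout) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify the smooth gene embedding, so a plain linear projection of fixed-size gene patches stands in for it, and the names `make_params`, `annotate_cell`, `patch_size`, `d`, and `n_classes` are hypothetical. Weights are random and untrained, so only the data flow is meaningful.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def make_params(rng, patch_size, d, n_classes):
    # Random, untrained weights (assumption: one attention layer, one head).
    return {
        "patch_size": patch_size,
        "W_embed": rng.normal(0, 0.1, (patch_size, d)),  # patch -> token
        "cell_token": rng.normal(0, 0.1, (1, d)),        # learnable summary token
        "Wq": rng.normal(0, 0.1, (d, d)),
        "Wk": rng.normal(0, 0.1, (d, d)),
        "Wv": rng.normal(0, 0.1, (d, d)),
        "W_out": rng.normal(0, 0.1, (d, n_classes)),     # classifier head
    }

def annotate_cell(expr, p):
    """Forward pass for one cell: expression vector -> cell-type probabilities."""
    # 1. Sequence modeling: split the gene vector into patches and project
    #    each patch to a d-dimensional token (stand-in for the paper's
    #    smooth gene embedding module).
    patches = expr.reshape(-1, p["patch_size"])
    tokens = patches @ p["W_embed"]
    # 2. Prepend the cell token so it can attend to every gene token.
    seq = np.vstack([p["cell_token"], tokens])
    # 3. Scaled dot-product self-attention with a residual connection.
    q, k, v = seq @ p["Wq"], seq @ p["Wk"], seq @ p["Wv"]
    attn = softmax(q @ k.T / np.sqrt(seq.shape[1]))
    seq = seq + attn @ v
    # 4. Classify from the cell token only (row 0), which now carries
    #    information aggregated from all gene tokens.
    return softmax(seq[0] @ p["W_out"]), attn

rng = np.random.default_rng(0)
params = make_params(rng, patch_size=4, d=8, n_classes=3)
expr = rng.poisson(0.5, size=12).astype(float)  # toy sparse count vector, 12 genes
probs, attn = annotate_cell(expr, params)
print(probs.shape, attn.shape)
```

Reading the prediction from a single prepended token, rather than averaging all gene tokens, is the same design used for the class token in other Transformer classifiers; the first row of `attn` shows how strongly the cell token weights each gene patch.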
Keyphrases
- single cell
- rna seq
- genome wide
- electronic health record
- big data
- high throughput
- deep learning
- dna methylation
- gene expression
- healthcare
- mass spectrometry
- high resolution
- copy number
- stem cells
- artificial intelligence
- cell proliferation
- climate change
- mesenchymal stem cells
- transcription factor
- induced apoptosis
- bioinformatics analysis
- pi k akt