Imputing abundance of over 2500 surface proteins from single-cell transcriptomes with context-agnostic zero-shot deep ensembles.
Ruoqiao Chen, Jiayu Zhou, Bin Chen
Published in: bioRxiv: the preprint server for biology (2024)
Cell surface proteins serve as primary drug targets and cell identity markers. Techniques such as CITE-seq now allow surface protein abundance and transcript expression to be quantified simultaneously within individual cells, enabling multimodal data analysis. Published CITE-seq data have been used to train machine learning models that predict surface protein abundance from transcript expression alone. However, the small number of proteins these computational approaches can predict, and their poor generalization across diverse contexts such as different tissues or disease states, impede their widespread adoption. Here we propose SPIDER (surface protein prediction using deep ensembles from single-cell RNA-seq), a context-agnostic zero-shot deep ensemble model that enables large-scale prediction of cell surface protein abundance and generalizes better to various contexts. Comprehensive benchmarking shows that SPIDER outperforms other state-of-the-art methods. Using the predicted surface abundance of >2500 proteins from single-cell transcriptomes, we demonstrate broad applications of SPIDER, including cell type annotation, biomarker/target identification, and cell-cell interaction analysis in hepatocellular carcinoma and colorectal cancer.
Keyphrases
- single cell
- rna seq
- cell surface
- high throughput
- data analysis
- antibiotic resistance genes
- binding protein
- machine learning
- protein protein
- poor prognosis
- amino acid
- electronic health record
- stem cells
- emergency department
- systematic review
- randomized controlled trial
- small molecule
- mass spectrometry
- microbial community
- cell death
- bone marrow
- pain management
- oxidative stress
- cell proliferation
- deep learning
- genome wide
- drug induced
- chronic pain
- bioinformatics analysis