scROSHI: robust supervised hierarchical identification of single cells.
Michael Prummer, Anne Bertolini, Lars Bosshard, Florian Barkmann, Josephine Yates, Valentina Boeva, Daniel Johannes Stekhoven, Franziska Singer
Published in: NAR genomics and bioinformatics (2023)
Identifying cell types based on expression profiles is a pillar of single-cell analysis. Existing machine-learning methods identify predictive features from annotated training data, which are often not available in early-stage studies. When training data are scarce, such models can overfit and perform poorly on new data. To address these challenges we present scROSHI, which utilizes previously obtained cell type-specific gene lists and requires neither training nor annotated data. By respecting the hierarchical nature of cell type relationships and assigning cells consecutively to more specialized identities, it achieves excellent prediction performance. In a benchmark based on publicly available peripheral blood mononuclear cell (PBMC) data sets, scROSHI outperforms competing methods when training data are limited or the diversity between experiments is large.
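
The idea of assigning cells first to a major cell type and then to a more specialized subtype, using only predefined marker gene lists, can be illustrated with a minimal sketch. This is not the scROSHI implementation: the scoring rule (mean marker expression), the `min_score` threshold, the function names, and the small marker lists are all assumptions chosen for illustration.

```python
from statistics import mean

# Illustrative marker lists (assumed for this sketch, not taken from the paper).
MAJOR_MARKERS = {
    "T cell": ["CD3D", "CD3E"],
    "B cell": ["CD79A", "MS4A1"],
}
# Subtypes are only considered once a cell has been assigned to their parent type.
SUBTYPE_MARKERS = {
    "T cell": {
        "CD4 T cell": ["CD4"],
        "CD8 T cell": ["CD8A", "CD8B"],
    },
}


def score(cell, genes):
    """Mean expression of the marker genes in one cell; absent genes count as 0."""
    return mean([cell.get(gene, 0.0) for gene in genes])


def best_label(cell, marker_sets, min_score=0.5):
    """Return the highest-scoring label, or None if no label reaches min_score."""
    scores = {label: score(cell, genes) for label, genes in marker_sets.items()}
    label = max(scores, key=scores.get)
    return label if scores[label] >= min_score else None


def assign(cell):
    """Assign a major type first, then refine to a subtype where one is defined."""
    major = best_label(cell, MAJOR_MARKERS)
    if major is None:
        return "unknown"
    subtypes = SUBTYPE_MARKERS.get(major)
    if not subtypes:
        return major
    subtype = best_label(cell, subtypes)
    # Fall back to the major type when no subtype is supported by the data.
    return subtype if subtype is not None else major


if __name__ == "__main__":
    # A cell is represented as a mapping from gene symbol to expression value.
    example = {"CD3D": 2.0, "CD3E": 1.5, "CD8A": 3.0, "CD8B": 1.0}
    print(assign(example))
```

No labelled training cells are involved: the only inputs are the gene lists and the expression profile of each cell, and a cell that matches no list at a given level keeps the label from the level above (or "unknown" at the top level).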
Keyphrases
- machine learning
- electronic health record
- big data
- single cell
- early stage
- induced apoptosis
- artificial intelligence
- data analysis
- rna seq
- gene expression
- palliative care
- squamous cell carcinoma
- deep learning
- endoplasmic reticulum stress
- signaling pathway
- neoadjuvant chemotherapy
- mesenchymal stem cells
- bone marrow
- copy number
- sentinel lymph node