Self-assembling manifolds in single-cell RNA sequencing data.
Alexander J Tarashansky, Yuan Xue, Pengyang Li, Stephen R Quake, Bo Wang
Published in: eLife (2019)
Single-cell RNA sequencing has spurred the development of computational methods that enable researchers to classify cell types, delineate developmental trajectories, and measure molecular responses to external perturbations. Many of these methods rely on the ability to detect genes whose cell-to-cell variations arise from the biological processes of interest rather than transcriptional or technical noise. However, for datasets in which the biologically relevant differences between cells are subtle, identifying these genes is challenging. We present the self-assembling manifold (SAM) algorithm, an iterative soft feature selection strategy to quantify gene relevance and improve dimensionality reduction. We demonstrate its advantages over other state-of-the-art methods with experimental validation in identifying novel stem cell populations of Schistosoma mansoni, a prevalent parasite that infects hundreds of millions of people. Extending our analysis to a total of 56 datasets, we show that SAM is generalizable and consistently outperforms other methods in a variety of biological and quantitative benchmarks.
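The abstract describes SAM only at a high level, as an iterative soft feature selection strategy. The general idea of such a scheme can be illustrated with a minimal sketch, which is not the published SAM implementation. It assumes a cells-by-genes matrix, a brute-force nearest-neighbor search, and the variance of neighbor-averaged expression as a simple stand-in for a gene relevance score. The function names `knn_indices` and `soft_feature_weights` and all parameters are hypothetical.

```python
import numpy as np


def knn_indices(X, k):
    """Indices of the k nearest rows (self included) for every row of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared Euclidean distances
    return np.argsort(d2, axis=1)[:, :k]


def soft_feature_weights(X, k=10, n_iter=10, tol=1e-4):
    """Iteratively assign each gene (column of X) a weight in [0, 1].

    Illustrative assumption, not the authors' method: a gene is treated as
    relevant when its expression, averaged over each cell's neighbors, still
    varies strongly across cells.
    """
    n_genes = X.shape[1]
    w = np.ones(n_genes)  # start with every gene weighted equally
    centered = X - X.mean(axis=0)
    for _ in range(n_iter):
        # Neighbors are found in the space rescaled by the current weights.
        nbrs = knn_indices(centered * w, k)
        # X[nbrs] has shape (cells, k, genes); average over the k neighbors.
        smoothed = X[nbrs].mean(axis=1)
        score = smoothed.var(axis=0)
        top = score.max()
        new_w = score / top if top > 0 else np.ones(n_genes)
        converged = np.max(np.abs(new_w - w)) < tol
        w = new_w
        if converged:
            break
    return w
```

Genes are down-weighted rather than discarded, which is what makes the selection "soft": a gene with a small weight still contributes a little to the distances used in the next iteration.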
Keyphrases
- single cell
- rna seq
- high throughput
- stem cells
- machine learning
- genome wide
- induced apoptosis
- gene expression
- cell therapy
- depressive symptoms
- deep learning
- mesenchymal stem cells
- computed tomography
- genome wide identification
- transcription factor
- magnetic resonance imaging
- bone marrow
- cell proliferation
- oxidative stress
- bioinformatics analysis
- plasmodium falciparum
- endoplasmic reticulum stress
- data analysis
- genome wide analysis
- contrast enhanced
- toxoplasma gondii