Binomial models uncover biological variation during feature selection of droplet-based single-cell RNA sequencing.

Breanne Sparta, Timothy Hamilton, Gunalan Natesan, Samuel D Aragones, Eric J Deeds
Published in: PLoS computational biology (2024)
Effective analysis of single-cell RNA sequencing (scRNA-seq) data requires a rigorous distinction between technical noise and biological variation. In this work, we propose a simple feature selection model, termed "Differentially Distributed Genes" or DDGs, where a binomial sampling process for each mRNA species produces a null model of technical variation. Using scRNA-seq data where cell identities have been established a priori, we find that the DDG model of biological variation outperforms existing methods. We demonstrate that DDGs distinguish a validated set of real biologically varying genes, minimize neighborhood distortion, and enable accurate partitioning of cells into their established cell-type groups.
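To make the idea of a binomial null model concrete, here is a minimal sketch in Python. It is not the authors' implementation, and every name in it (`zero_count_z_scores`, the toy matrix) is hypothetical. It assumes that, under purely technical variation, each of a gene's observed molecules lands in a given cell with probability equal to that cell's share of all counts, so the count in that cell is binomial and the chance of seeing zero is (1 - share)^(gene total). Genes with many more zero-count cells than this null predicts are candidates for biological variation. The zero-count statistic and the z-score are assumptions made for this illustration; the abstract does not specify which statistic the DDG model uses.

```python
import math


def zero_count_z_scores(counts):
    """Score each gene by how far its number of zero-count cells departs
    from a binomial sampling null (illustrative assumption, not the
    published DDG procedure).

    counts: list of rows, one row per gene, one column per cell.
    Returns one z-score per gene; large positive values mean more zeros
    than technical sampling alone would predict.
    """
    n_cells = len(counts[0])
    cell_totals = [sum(row[c] for row in counts) for c in range(n_cells)]
    grand_total = sum(cell_totals)
    if grand_total == 0:
        raise ValueError("count matrix contains no molecules")

    # Assumed capture share of each cell: its fraction of all counts.
    shares = [total / grand_total for total in cell_totals]

    scores = []
    for row in counts:
        gene_total = sum(row)
        observed_zeros = sum(1 for x in row if x == 0)

        # P(count == 0) in each cell if the gene's molecules were
        # distributed binomially according to the cell shares.
        p_zero = [(1.0 - s) ** gene_total for s in shares]

        # Zeros across cells are treated as independent Bernoulli trials
        # (an approximation: counts are coupled through the fixed gene total).
        expected = sum(p_zero)
        variance = sum(p * (1.0 - p) for p in p_zero)

        if variance > 0:
            scores.append((observed_zeros - expected) / math.sqrt(variance))
        else:
            scores.append(0.0)
    return scores


# Toy data: gene 0 is expressed evenly, gene 1 only in half of the cells.
toy_counts = [
    [5] * 10,
    [10] * 5 + [0] * 5,
]
scores = zero_count_z_scores(toy_counts)
```

In this toy matrix both genes have the same total count, so they share the same null expectation; only the second gene, which is absent from half of the cells, shows an excess of zeros. A real analysis would need a multiple-testing correction and a more careful estimate of per-cell capture probability than the crude share used here.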
Keyphrases