
cytoKernel: Robust kernel embeddings for assessing differential expression of single cell data.

Tusharkanti Ghosh, Ryan M Baxter, Souvik Seal, Victor G Lui, Pratyaydipta Rudra, Thao Vu, Elena W Y Hsieh, Debashis Ghosh
Published in: bioRxiv : the preprint server for biology (2024)
High-throughput sequencing of single-cell data can be used to rigorously evaluate cell specification and to characterize intricate variations between groups or conditions. Many popular existing methods for differential expression target differences in aggregate measurements (mean, median, sum) and are therefore limited to detecting only global differential changes. We present cytoKernel, a robust method for differential expression of single-cell data using a kernel-based score test. cytoKernel is specifically designed to assess the differential expression of single-cell RNA sequencing and high-dimensional flow or mass cytometry data using the full probability distribution pattern. cytoKernel is based on kernel embeddings, which use the probability distributions of the single-cell data by calculating the pairwise divergence/distance between the distributions of subjects. It can detect both patterns involving aggregate changes and more elusive variations that are often overlooked due to the multimodal characteristics of single-cell data. We performed extensive benchmarks across both simulated and real data sets from mass cytometry and single-cell RNA sequencing. The cytoKernel procedure effectively controls the False Discovery Rate (FDR) and shows favourable performance compared to existing methods. The method is able to identify more differential patterns than existing approaches. We apply cytoKernel to assess gene expression and protein marker expression differences from cell subpopulations in various publicly available single-cell RNA-seq and mass cytometry data sets. The methods described in this paper are implemented in the open-source R package cytoKernel, which is freely available from Bioconductor at http://bioconductor.org/packages/cytoKernel.
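The idea of comparing subjects through pairwise divergences between their full distributions, rather than through means or medians, can be illustrated with a minimal Python sketch. This is not the cytoKernel implementation and does not use the R package's API. The histogram density estimate, the Jensen-Shannon divergence, the exponential kernel with a median-based scale, and the permutation p-value (used here in place of the paper's score test) are all assumptions made for illustration, and every function name below is hypothetical.

```python
import numpy as np


def subject_histograms(samples, bin_edges, pseudocount=1e-9):
    """Turn each subject's cell-level values into a normalized histogram."""
    hists = []
    for values in samples:
        counts, _ = np.histogram(values, bins=bin_edges)
        counts = counts.astype(float) + pseudocount  # keep every bin strictly positive
        hists.append(counts / counts.sum())
    return np.array(hists)


def js_divergence(p, q):
    """Jensen-Shannon divergence between two strictly positive histograms."""
    m = 0.5 * (p + q)
    return 0.5 * np.sum(p * np.log(p / m)) + 0.5 * np.sum(q * np.log(q / m))


def divergence_kernel(hists):
    """Subject-by-subject kernel built from pairwise divergences (assumed form)."""
    n = len(hists)
    div = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            div[i, j] = div[j, i] = js_divergence(hists[i], hists[j])
    scale = np.median(div[np.triu_indices(n, k=1)])  # median heuristic for the bandwidth
    return np.exp(-div / scale)


def kernel_permutation_test(kernel, labels, n_perm=999, seed=0):
    """Quadratic-form statistic y' K y on centered labels, with a permutation p-value."""
    rng = np.random.default_rng(seed)
    centered = labels - labels.mean()
    observed = centered @ kernel @ centered
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(centered)
        if perm @ kernel @ perm >= observed:
            exceed += 1
    return observed, (1 + exceed) / (1 + n_perm)


# Simulated marker for 12 subjects: both groups have mean 0, but one is bimodal.
rng = np.random.default_rng(1)
n_cells = 300
unimodal = [rng.normal(0.0, 1.0, n_cells) for _ in range(6)]
bimodal = [
    np.concatenate([rng.normal(-2.0, 0.5, n_cells // 2),
                    rng.normal(2.0, 0.5, n_cells // 2)])
    for _ in range(6)
]
labels = np.array([0.0] * 6 + [1.0] * 6)

hists = subject_histograms(unimodal + bimodal, np.linspace(-6.0, 6.0, 31))
kernel = divergence_kernel(hists)
statistic, p_value = kernel_permutation_test(kernel, labels)
print(f"statistic={statistic:.3f}, permutation p-value={p_value:.4f}")
```

In this simulation the two groups share the same mean, so a test on subject-level averages would have little to work with, while the distribution-level kernel separates unimodal from bimodal subjects. A real analysis would repeat such a test per gene or marker and then apply a multiple-testing correction to control the FDR.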
Keyphrases
  • single cell
  • rna seq
  • high throughput
  • electronic health record
  • gene expression
  • small molecule
  • deep learning
  • artificial intelligence
  • pain management