Biophysically Interpretable Inference of Cell Types from Multimodal Sequencing Data.
Tara Chari, Gennady Gorin, Lior Pachter
Published in: bioRxiv : the preprint server for biology (2023)
Multimodal, single-cell genomics technologies enable simultaneous capture of multiple facets of DNA and RNA processing in the cell. This creates opportunities for transcriptome-wide, mechanistic studies of cellular processing in heterogeneous cell types, with applications ranging from inferring kinetic differences between cells to characterizing the role of stochasticity in driving heterogeneity. However, current methods for determining cell types or 'clusters' present in multimodal data often rely on ad hoc or independent treatment of modalities, and on assumptions that ignore inherent properties of the count data. To enable interpretable and consistent cell cluster determination from multimodal data, we present meK-Means (mechanistic K-Means), which integrates modalities and learns underlying, shared biophysical states through a unifying model of transcription. In particular, we demonstrate how meK-Means can be used to cluster cells from unspliced and spliced mRNA count modalities. By utilizing the causal, physical relationships underlying these modalities, we identify shared transcriptional kinetics across cells, which induce the observed gene expression profiles, and provide an alternative definition for 'clusters' through the governing parameters of cellular processes.
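To make the idea of clustering on shared kinetics concrete, the following is a minimal sketch and not the authors' implementation. It assumes a much simpler model than the abstract implies: constitutive transcription at steady state, under which the unspliced and spliced counts of a gene are independent Poisson variables with means k/β and k/γ (k is the transcription rate, β the splicing rate, γ the degradation rate). Cells are assigned to clusters by a K-Means-like hard-EM loop over the joint likelihood of both modalities. The function names `poisson_loglik` and `cluster_by_kinetics`, the count-based initialization, and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def poisson_loglik(counts, means):
    """Log-likelihood of each cell under each cluster's independent Poisson means.

    counts: (n_cells, n_genes); means: (n_clusters, n_genes).
    The log(count!) term is dropped because it is identical for every cluster.
    """
    return counts @ np.log(means).T - means.sum(axis=1)


def cluster_by_kinetics(U, S, n_clusters, n_iter=50, eps=1e-8):
    """Hard-EM clustering of cells from unspliced (U) and spliced (S) counts.

    Assumed model (illustrative only): per cluster and gene,
    U ~ Poisson(k / beta) and S ~ Poisson(k / gamma), independently.
    Returns labels and the fitted per-cluster means of U and S.
    """
    U = np.asarray(U, dtype=float)
    S = np.asarray(S, dtype=float)
    n_cells, n_genes = U.shape

    # Deterministic initialization (an assumption of this sketch):
    # rank cells by total counts and split the ranking into equal groups.
    order = np.argsort(U.sum(axis=1) + S.sum(axis=1), kind="stable")
    labels = np.empty(n_cells, dtype=int)
    labels[order] = np.arange(n_cells) * n_clusters // n_cells

    for _ in range(n_iter):
        # M-step: the Poisson maximum-likelihood mean is the cluster average.
        mu_u = np.full((n_clusters, n_genes), eps)
        mu_s = np.full((n_clusters, n_genes), eps)
        weights = np.full(n_clusters, eps)
        for c in range(n_clusters):
            members = labels == c
            if members.any():
                mu_u[c] = U[members].mean(axis=0) + eps
                mu_s[c] = S[members].mean(axis=0) + eps
                weights[c] = members.mean()

        # E-step (hard): both modalities contribute to one joint likelihood.
        loglik = poisson_loglik(U, mu_u) + poisson_loglik(S, mu_s) + np.log(weights)
        new_labels = loglik.argmax(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, mu_u, mu_s
```

Under this assumed model, the fitted ratio `mu_s / mu_u` for a cluster estimates β/γ for each gene, so each cluster is summarized by kinetic parameters rather than by an expression centroid alone. A model with transcriptional bursting would replace the Poisson likelihood with a more involved joint distribution, which this sketch does not attempt.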
Keyphrases
- single cell
- rna seq
- high throughput
- cell therapy
- induced apoptosis
- electronic health record
- big data
- pain management
- physical activity
- mesenchymal stem cells
- oxidative stress
- transcription factor
- mental health
- cell death
- dna methylation
- cell proliferation
- peripheral blood
- mass spectrometry
- heat shock protein
- artificial intelligence
- signaling pathway
- circulating tumor
- data analysis