CoCoA-diff: counterfactual inference for single-cell gene expression analysis.
Yongjin P Park, Manolis Kellis. Published in: Genome Biology (2021)
Finding a causal gene is a fundamental problem in genomic medicine. We present a causal inference framework, CoCoA-diff, that prioritizes disease genes by adjusting for confounders without prior knowledge of control variables in single-cell RNA-seq data. We demonstrate that our method substantially improves statistical power in simulations and in real-world data analysis of 70k brain cells collected for dissecting Alzheimer's disease. We identify 215 differentially regulated causal genes in various cell types, including highly relevant genes with a proper cell type context. Genes found in different cell types are enriched in distinct pathways, highlighting the importance of cell types in understanding multifaceted disease mechanisms.
Keyphrases
- single cell
- RNA-seq
- genome wide identification
- genome wide
- transcription factor
- high throughput
- genome wide analysis
- copy number
- bioinformatics analysis
- DNA methylation
- induced apoptosis
- electronic health record
- healthcare
- big data
- stem cells
- cell therapy
- cell cycle arrest
- artificial intelligence
- data analysis
- mild cognitive impairment
- white matter
- machine learning
- PI3K/Akt