RCRdiff: A fully integrated Bayesian method for differential expression analysis using raw NanoString nCounter data.

Can Xu, Xinlei Wang, Johan Lim, Guanghua Xiao, Yang Xie
Published in: Statistics in Medicine (2021)
The medium-throughput mRNA abundance platform NanoString nCounter has gained great popularity in the past decade, owing to its high sensitivity and technical reproducibility as well as its applicability to the widely available formalin-fixed paraffin-embedded (FFPE) tissue samples. Building on RCRnorm, which was developed for normalizing NanoString nCounter data, and on the Bayesian LASSO for variable selection, we propose a fully integrated Bayesian method, called RCRdiff, to detect differentially expressed (DE) genes between different groups of tissue samples (e.g., normal and cancer). Unlike existing methods, which often require normalization to be performed beforehand, RCRdiff directly handles raw read counts and jointly models the behaviors of different types of internal controls along with DE and non-DE gene patterns. Doing so avoids the efficiency loss that a sequential approach incurs by ignoring estimation uncertainty from the normalization step, and can thus offer more reliable statistical inference. We also propose clustering-based strategies for DE gene selection, which do not require any external dataset and are free of any arbitrary cutoff. The advantages of RCRdiff are demonstrated empirically via extensive simulations and real-data examples.
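The abstract does not spell out how the clustering-based selection works, so the following is only a rough illustration of the general idea of replacing a fixed cutoff with a data-driven split. It assumes, hypothetically, that each gene has a posterior mean group effect available, and it separates genes into a "near zero" and an "away from zero" cluster using plain one-dimensional 2-means on the absolute effects. The function names, the input format, and the choice of 2-means are all assumptions made for this sketch; it is not the RCRdiff procedure itself.

```python
from typing import Dict, List


def two_means_1d(values: List[float], max_iter: int = 100) -> List[int]:
    """Label each value 0 (low cluster) or 1 (high cluster) using
    Lloyd's algorithm with k = 2, initialised at the min and max."""
    if len(values) < 2:
        raise ValueError("need at least two values to cluster")
    lo, hi = min(values), max(values)
    if lo == hi:
        # All values identical: nothing to separate.
        return [0] * len(values)
    labels: List[int] = []
    for _ in range(max_iter):
        # Assign each value to the nearer centre (ties go to the low cluster).
        new_labels = [1 if abs(v - hi) < abs(v - lo) else 0 for v in values]
        if new_labels == labels:
            break
        labels = new_labels
        low = [v for v, lab in zip(values, labels) if lab == 0]
        high = [v for v, lab in zip(values, labels) if lab == 1]
        # Both clusters stay non-empty: the min always joins `low`
        # and the max always joins `high`.
        lo = sum(low) / len(low)
        hi = sum(high) / len(high)
    return labels


def select_de_genes(effects: Dict[str, float]) -> List[str]:
    """Return the genes whose absolute effect falls in the high cluster.

    `effects` maps gene name -> posterior mean group effect
    (a hypothetical input format assumed for this sketch).
    """
    genes = list(effects)
    labels = two_means_1d([abs(effects[g]) for g in genes])
    return [g for g, lab in zip(genes, labels) if lab == 1]


if __name__ == "__main__":
    # Made-up effect sizes, for illustration only.
    toy_effects = {"geneA": 0.05, "geneB": -0.02, "geneC": 1.9,
                   "geneD": -2.3, "geneE": 0.1}
    print(select_de_genes(toy_effects))
```

The split point here is determined by the data rather than by a user-chosen threshold, which is the property the abstract emphasises; the actual strategies in the paper may cluster different quantities or use a different clustering method.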
Keyphrases
  • electronic health record
  • genome wide
  • big data
  • genome wide identification
  • copy number
  • single cell
  • high throughput
  • single molecule
  • dna methylation
  • squamous cell
  • genome wide analysis
  • microbial community