A Bayesian model for unsupervised detection of RNA splicing based subtypes in cancers.
David Wang, Mathieu Quesnel-Vallieres, San Jewell, Moein Elzubeir, Kristen Lynch, Andrei Thomas-Tikhonenko, Yoseph Barash
Published in: Nature Communications (2023)
Identification of cancer subtypes is a pivotal step for developing personalized treatment. Specifically, subtyping based on changes in RNA splicing has been motivated by several recent studies. We thus develop CHESSBOARD, an unsupervised algorithm tailored for RNA splicing data that captures "tiles" in the data, each defined by a subset of unique splicing changes in a subset of patients. CHESSBOARD allows for a flexible number of tiles, accounts for uncertainty in splicing quantification, and is able to model missing values as additional signals. We first apply CHESSBOARD to synthetic data to assess its domain-specific modeling advantages, followed by analysis of several leukemia datasets. We show that detected tiles are reproducible in independent studies, investigate their possible regulatory drivers, and probe their relation to known acute myeloid leukemia (AML) mutations. Finally, we demonstrate the potential clinical utility of CHESSBOARD by supplementing mutation-based diagnostic assays with discovered splicing profiles to improve drug response correlation.
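To make the notion of a "tile" concrete, the following is a minimal toy sketch and not the CHESSBOARD model itself. It assumes splicing is quantified as an inclusion level between 0 and 1 per patient and event, plants one tile (a subset of patients by a subset of events with shifted values), marks some entries as missing, and measures how strongly the tile stands out from the remaining patients. All function names, parameters, and numeric settings here are hypothetical and chosen only for illustration.

```python
import random


def make_matrix(n_patients, n_events, tile_patients, tile_events,
                shift=0.4, missing_rate=0.1, seed=0):
    """Build a patients x events matrix of toy inclusion levels in [0, 1].

    Entries inside the planted tile are shifted upward. Some entries are
    replaced with None to mimic events that could not be quantified.
    (Illustrative assumption: this is not how the paper generates data.)
    """
    rng = random.Random(seed)
    matrix = []
    for p in range(n_patients):
        row = []
        for e in range(n_events):
            value = rng.uniform(0.2, 0.4)  # background inclusion level
            if p in tile_patients and e in tile_events:
                value += shift  # splicing change specific to the tile
            if rng.random() < missing_rate:
                value = None  # missing quantification
            row.append(value)
        matrix.append(row)
    return matrix


def mean_observed(matrix, patients, events):
    """Mean over the observed (non-None) entries of a block."""
    values = [matrix[p][e] for p in patients for e in events
              if matrix[p][e] is not None]
    return sum(values) / len(values) if values else float("nan")


def tile_contrast(matrix, tile_patients, tile_events):
    """Mean inside the tile minus the mean of other patients on the same events."""
    others = [p for p in range(len(matrix)) if p not in tile_patients]
    inside = mean_observed(matrix, tile_patients, tile_events)
    outside = mean_observed(matrix, others, tile_events)
    return inside - outside


tile_patients = set(range(5))   # first five patients carry the tile
tile_events = {2, 3, 4}         # three splicing events define it
matrix = make_matrix(20, 10, tile_patients, tile_events)
contrast = tile_contrast(matrix, sorted(tile_patients), sorted(tile_events))
print(round(contrast, 2))
```

In this sketch the tile membership is known in advance and missing entries are simply skipped. The abstract describes the harder unsupervised setting, in which tiles must be discovered, quantification uncertainty is modeled, and missing values are treated as a signal in their own right.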
Keyphrases
- machine learning
- electronic health record
- acute myeloid leukemia
- end stage renal disease
- big data
- chronic kidney disease
- ejection fraction
- newly diagnosed
- bone marrow
- emergency department
- squamous cell carcinoma
- deep learning
- patient reported outcomes
- climate change
- nucleic acid
- drug induced
- single cell
- patient reported
- risk assessment