Integrative identification of non-coding regulatory regions driving metastatic prostate cancer.
Brian J Woo, Ruhollah Moussavi-Baygi, Heather Karner, Mehran Karimzadeh, Kristle Garcia, Tanvi Joshi, Keyi Yin, Albertas Navickas, Luke A Gilbert, Bo Wang, Hosseinali Asgharian, Felix Y Feng, Hani Goodarzi
Published in: bioRxiv: the preprint server for biology (2023)
Large-scale sequencing efforts of thousands of tumor samples have been undertaken to understand the mutational landscape of the coding genome. However, the vast majority of germline and somatic variants occur within non-coding portions of the genome. These genomic regions do not directly encode proteins, but can play key roles in cancer progression, for example by driving aberrant control of gene expression. Here, we designed an integrative computational and experimental framework to identify recurrently mutated non-coding regulatory regions that drive tumor progression. Applying this approach to whole-genome sequencing (WGS) data from a large cohort of metastatic castration-resistant prostate cancer (mCRPC) revealed a large set of recurrently mutated regions. We used (i) in silico prioritization of functional non-coding mutations, (ii) massively parallel reporter assays, and (iii) in vivo CRISPR-interference (CRISPRi) screens in xenografted mice to systematically identify and validate regulatory regions that drive mCRPC. We discovered that one of these enhancer regions, GH22I030351, acts on a bidirectional promoter to simultaneously modulate expression of the U2-associated splicing factor SF3A1 and the chromosomal protein CCDC157. We found that both SF3A1 and CCDC157 promote tumor growth in xenograft models of prostate cancer. We nominated several transcription factors, including SOX6, as responsible for the higher expression of SF3A1 and CCDC157. Collectively, we have established and confirmed an integrative computational and experimental approach that enables the systematic detection of non-coding regulatory regions that drive the progression of human cancers.
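As a rough illustration of what "recurrently mutated" can mean in practice, the sketch below counts how many distinct samples carry at least one mutation inside each candidate regulatory region. It is not the authors' pipeline: the abstract does not describe their statistical model, so the function name `recurrent_regions`, the `min_samples` threshold, the half-open coordinate convention, and the toy data are all assumptions made for illustration only.

```python
from collections import defaultdict


def recurrent_regions(regions, mutations, min_samples=2):
    """Count distinct mutated samples per region (hypothetical helper).

    regions:   iterable of (name, chrom, start, end), half-open [start, end)
    mutations: iterable of (sample_id, chrom, position)
    Returns {region_name: n_samples} for regions hit in >= min_samples samples.
    """
    samples_by_region = defaultdict(set)
    for sample, chrom, pos in mutations:
        for name, r_chrom, start, end in regions:
            if chrom == r_chrom and start <= pos < end:
                # A set ensures each sample is counted once per region,
                # even if it carries several mutations there.
                samples_by_region[name].add(sample)
    return {
        name: len(samples)
        for name, samples in samples_by_region.items()
        if len(samples) >= min_samples
    }


# Toy data (invented): two regions, three samples.
regions = [
    ("region_A", "chr1", 100, 200),
    ("region_B", "chr2", 500, 600),
]
mutations = [
    ("s1", "chr1", 150),
    ("s2", "chr1", 199),
    ("s2", "chr1", 120),  # second hit in the same sample, not double counted
    ("s3", "chr2", 550),
    ("s1", "chr2", 700),  # falls outside every region
]
result = recurrent_regions(regions, mutations)
print(result)
```

A simple sample count like this ignores region length and background mutation rate, which a real analysis would need to model before calling a region recurrently mutated. The nested loop is also only suitable for small inputs; an interval index would be the usual choice at whole-genome scale.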
Keyphrases
- transcription factor
- prostate cancer
- gene expression
- poor prognosis
- genome wide
- copy number
- small cell lung cancer
- squamous cell carcinoma
- dna methylation
- binding protein
- crispr cas
- endothelial cells
- high throughput
- radical prostatectomy
- stem cells
- molecular docking
- type diabetes
- machine learning
- skeletal muscle
- network analysis
- big data
- young adults
- metabolic syndrome
- childhood cancer