Machine learning identifies activation of RUNX/AP-1 as drivers of mesenchymal and fibrotic regulatory programs in gastric cancer.
Milad Razavi-Mohseni, Weitai Huang, Yu A Guo, Dustin Shigaki, Shamaine Wei Ting Ho, Patrick Tan, Anders J Skanderup, Michael A Beer
Published in: Genome Research (2024)
Gastric cancer (GC) is the fifth most common cancer worldwide and is a heterogeneous disease. Among GC subtypes, the mesenchymal phenotype (Mes-like) is more invasive than the epithelial phenotype (Epi-like). Although gene expression of the epithelial-to-mesenchymal transition (EMT) has been studied, the regulatory landscape shaping this process is not fully understood. Here we use ATAC-seq and RNA-seq data from a compendium of GC cell lines and primary tumors to detect drivers of regulatory state changes and their transcriptional responses. Using the ATAC-seq data, we developed a machine learning approach to determine the transcription factors (TFs) regulating the subtypes of GC. We identified TFs driving the mesenchymal (RUNX2, ZEB1, SNAI2, AP-1 dimer) and the epithelial (GATA4, GATA6, KLF5, HNF4A, FOXA2, GRHL2) states in GC. We identified DNA copy number alterations associated with dysregulation of these TFs, specifically deletion of GATA4 and amplification of MAPK9. Comparisons with bulk and single-cell RNA-seq data sets identified activation toward fibroblast-like epigenomic and expression signatures in Mes-like GC. The activation of this mesenchymal fibrotic program is associated with differentially accessible DNA cis-regulatory elements flanking upregulated mesenchymal genes. These findings establish a map of TF activity in GC and highlight the role of copy-number-driven alterations in shaping epigenomic regulatory programs as potential drivers of GC heterogeneity and progression.
Keyphrases
- transcription factor
- rna seq
- single cell
- copy number
- genome wide
- machine learning
- mitochondrial dna
- gas chromatography
- bone marrow
- dna binding
- genome wide identification
- gene expression
- stem cells
- dna methylation
- high throughput
- big data
- epithelial mesenchymal transition
- electronic health record
- circulating tumor
- public health
- systemic sclerosis
- mass spectrometry
- artificial intelligence
- high resolution
- risk assessment
- toll like receptor
- deep learning
- young adults
- squamous cell carcinoma
- pi3k akt
- idiopathic pulmonary fibrosis
- heat shock
- simultaneous determination
- liquid chromatography