A supervised machine learning approach identifies gene-regulating factor-mediated competing endogenous RNA networks in hormone-dependent cancers.
Dulari K Jayarathna, Miguel E Renteria, Jyotsna Batra, Neha S Gandhi
Published in: Journal of cellular biochemistry (2022)
Competing endogenous RNAs (ceRNAs) have become an emerging topic in cancer research due to their role in gene regulatory networks. To date, traditional ceRNA bioinformatic studies have investigated microRNAs as the only factor regulating gene expression. Growing evidence suggests that genomic (e.g., copy number alteration [CNA]), transcriptomic (e.g., transcription factors [TFs]), and epigenomic (e.g., DNA methylation [DM]) factors can also influence ceRNA regulatory networks. Herein, we used least absolute shrinkage and selection operator (LASSO) regression, a machine learning approach, to integrate DM, CNA, and TF data with RNA expression and infer ceRNA networks associated with cancer risk. The gene-regulating factor-mediated ceRNA networks were identified in four hormone-dependent (HD) cancer types: prostate, breast, colorectal, and endometrial. The ceRNAs shared across HD cancer types were further investigated using survival analysis, functional enrichment analysis, and protein-protein interaction network analysis. We found two survival-significant ceRNAs (BUB1 and EXO1) shared across the breast-colorectal-endometrial combination and one (RRM2) shared across the prostate-colorectal-endometrial combination. Both BUB1 and BUB1B were identified as ceRNAs shared across more than two of the HD cancers of interest. These genes play a critical role in cell division, spindle-assembly checkpoint signalling, and correct chromosome alignment. Furthermore, ceRNAs shared across multiple HD cancers are involved in essential cancer pathways such as the cell cycle, p53 signalling, and chromosome segregation. Identifying the roles of ceRNAs across multiple related cancers will improve our understanding of their shared disease biology. Moreover, it contributes to the knowledge of RNA-mediated cancer pathogenesis.
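The abstract does not spell out how the regression is set up, so the following is only a minimal sketch of the general idea: LASSO regresses a target gene's expression on several candidate regulators and shrinks the coefficients of uninformative ones to exactly zero. It is not the authors' pipeline. The feature names, the synthetic data, the penalty `alpha`, and the helper functions are all assumptions made for illustration, and the solver is a plain coordinate-descent implementation using only the standard library.

```python
import random


def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200, tol=1e-10):
    """Minimise (1/2n)*||y - Xb||^2 + alpha*||b||_1 by cyclic coordinate descent.

    Assumes the columns of X and the vector y are already centred,
    so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 at the start
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if z[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of feature j with the residual that excludes feature j
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new_bj = soft_threshold(rho, alpha) / z[j]
            delta = new_bj - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new_bj
            max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return b


def centre(values):
    """Subtract the mean from a list of numbers."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def demo(seed=0, n=200, alpha=0.1):
    """Fit the sketch on synthetic data; every name and effect size is hypothetical."""
    rng = random.Random(seed)
    names = ["miRNA", "DNA_methylation", "CNA", "TF", "candidate_ceRNA", "noise"]
    cols = [centre([rng.gauss(0, 1) for _ in range(n)]) for _ in names]
    X = [[col[i] for col in cols] for i in range(n)]
    # Assumed ground truth: target expression depends on miRNA, CNA, and the ceRNA
    y = centre([
        -0.5 * row[0] + 0.4 * row[2] + 0.8 * row[4] + rng.gauss(0, 0.3)
        for row in X
    ])
    coefs = lasso_coordinate_descent(X, y, alpha)
    # Report only the regulators whose coefficients survived the L1 penalty
    return {name: c for name, c in zip(names, coefs) if c != 0.0}


if __name__ == "__main__":
    for name, coef in demo().items():
        print(f"{name}: {coef:+.3f}")
```

In this toy setting, regulators that keep a non-zero coefficient would be read as candidate edges for the target gene. A larger `alpha` gives a sparser selection, and in practice the penalty is usually chosen by cross-validation.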
Keyphrases
- copy number
- genome wide
- papillary thyroid
- machine learning
- dna methylation
- cell cycle
- gene expression
- long non coding rna
- mitochondrial dna
- squamous cell
- prostate cancer
- healthcare
- transcription factor
- single cell
- protein protein
- big data
- poor prognosis
- artificial intelligence
- stem cells
- genome wide identification
- dna damage
- network analysis
- endometrial cancer
- oxidative stress
- electronic health record
- rna seq
- dna binding