Identification of ZMYND19 as a novel biomarker of colorectal cancer: RNA-sequencing and machine learning analysis.
Ghazaleh Khalili-Tanha, Reza Mohit, Alireza Asadnia, Majid Khazaei, Mohammad Dashtiahangar, Mina Maftooh, Mohammadreza Nassiri, Seyed Mahdi Hassanian, Majid Ghayour-Mobarhan, Mohammad Ali Kiani, Gordon A Ferns, Jyotsna Batra, Elham Nazari, Amir Avan
Published in: Journal of cell communication and signaling (2023)
Colorectal cancer (CRC) is the third most common cause of cancer-related deaths. The five-year relative survival rate for CRC is estimated to be approximately 90% for patients diagnosed at an early stage and 14% for those diagnosed at an advanced stage of disease. Hence, accurate prognostic markers are needed. Bioinformatics enables the identification of dysregulated pathways and novel biomarkers. RNA expression profiling was performed in CRC patients from the TCGA database using a machine learning approach to identify differentially expressed genes (DEGs). Survival curves were assessed using Kaplan-Meier analysis to identify prognostic biomarkers. Furthermore, the molecular pathways, protein-protein interactions, the co-expression of DEGs, and the correlation between DEGs and clinical data were evaluated. The diagnostic markers were then determined based on machine learning analysis. The results indicated that the key upregulated genes, including C10orf2, NOP2, DKC1, BYSL, RRP12, PUS7, MTHFD1L, and PPAT, are associated with RNA processing and the heterocycle metabolic process. Furthermore, the survival analysis identified NOP58, OSBPL3, DNAJC2, and ZMYND19 as prognostic markers. The combineROC curve analysis indicated that the combination of C10orf2-PPAT-ZMYND19 can be considered a diagnostic marker panel, with sensitivity, specificity, and AUC values of 0.98, 1.00, and 0.99, respectively. Finally, the ZMYND19 gene was validated in CRC patients. In conclusion, novel biomarkers of CRC have been identified that may support early diagnosis, potential treatment, and better prognosis.
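To make the reported diagnostic metrics concrete, the following minimal Python sketch shows how sensitivity, specificity, and AUC could be computed for a combined three-gene score. It is an illustration only and not the authors' pipeline: the expression values are synthetic, and the unweighted-sum rule in `combined_score`, the threshold of 10.0, and all function names are assumptions introduced here. The abstract does not state how the marker combination was scored.

```python
from typing import List, Sequence, Tuple


def combined_score(expr: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of marker expression values (hypothetical combination rule)."""
    return sum(e * w for e, w in zip(expr, weights))


def sensitivity_specificity(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[float, float]:
    """Call a sample positive when its score is >= threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (tumor, normal) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Synthetic expression values for (C10orf2, PPAT, ZMYND19); NOT data from the study.
samples: List[Tuple[float, float, float]] = [
    (5.1, 4.8, 6.0),  # tumor
    (4.9, 5.2, 5.5),  # tumor
    (3.0, 3.1, 2.9),  # tumor with low expression
    (4.2, 4.0, 4.4),  # tumor
    (2.0, 2.2, 1.9),  # normal
    (2.5, 2.1, 2.4),  # normal
    (3.2, 3.0, 3.1),  # normal
    (1.8, 2.0, 2.2),  # normal
]
labels: List[int] = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = tumor, 0 = normal
weights = (1.0, 1.0, 1.0)  # equal weights, an assumption for illustration

scores = [combined_score(s, weights) for s in samples]
sens, spec = sensitivity_specificity(scores, labels, threshold=10.0)

print(f"sensitivity={sens:.2f} specificity={spec:.2f} AUC={auc(scores, labels):.4f}")
```

The pairwise definition of AUC is used because it needs no external library and is equivalent to the area under the empirical ROC curve; in practice a fitted model such as logistic regression would typically supply the combined score.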
Keyphrases
- machine learning
- end stage renal disease
- ejection fraction
- newly diagnosed
- prognostic factors
- genome wide
- small molecule
- protein protein
- dna methylation
- emergency department
- risk assessment
- bioinformatics analysis
- genome wide identification
- poor prognosis
- big data
- single cell
- transcription factor
- adverse drug
- patient reported
- genome wide analysis