Machine Learning and Novel Biomarkers Associated with Immune Infiltration for the Diagnosis of Esophageal Squamous Cell Carcinoma.
Jipeng Zhang, Nian Zhang, Xin Yang, Xiangbin Xin, Cheng-Hui Jia, Sen Li, Qiang Lu, Xiaofeng Guo, Tao Wang
Published in: Journal of Oncology (2022)
Esophageal squamous cell carcinoma (ESCC) is the predominant type of esophageal cancer and is often diagnosed at an advanced stage, with poor survival. Novel diagnostic biomarkers are therefore critically needed. In this study, we aimed to screen for novel diagnostic biomarkers using machine learning. Expression profiles were obtained from GEO datasets (GSE20347, GSE38129, and GSE75241) and the TCGA dataset. Differentially expressed genes (DEGs) were screened between 47 ESCC and 47 nontumor samples. LASSO regression and SVM-RFE were used to identify potential markers, and ROC analysis was used to assess their discriminatory ability. The expression and diagnostic value of the candidates in ESCC were then validated in the GSE75241 and TCGA datasets. We also explored the correlations between the key genes and tumor immune infiltrates using CIBERSORT. We identified 27 DEGs in ESCC: 5 genes were significantly upregulated and 22 were significantly downregulated. Based on the results of SVM-RFE and the LASSO regression model, we identified five potential diagnostic biomarkers for ESCC: GPX3, COL11A1, EREG, MMP1, and MMP12. However, the diagnostic value of only GPX3, MMP1, and MMP12 was confirmed in the GSE75241 dataset. In the TCGA dataset, we further confirmed that GPX3 expression was markedly decreased in ESCC specimens, whereas MMP1 and MMP12 expression was markedly increased. Immune cell infiltration analysis revealed that the expression of GPX3, MMP1, and MMP12 was associated with several immune cell types, such as CD8 T cells, M2 macrophages, M0 macrophages, and activated dendritic cells. Overall, our findings suggest that GPX3, MMP1, and MMP12 are novel diagnostic markers that correlate with immune infiltration in ESCC patients.
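The selection workflow described in the abstract (LASSO, SVM-RFE, then ROC analysis on a separate dataset) might be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' code: the gene names, effect sizes, the regularization strength `C=0.5`, the choice of an L1-penalized logistic regression as the LASSO step, and the use of a simple intersection to combine the two selectors are all assumptions, since the abstract does not specify them.

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
genes = [f"gene_{i}" for i in range(27)]  # hypothetical names standing in for 27 DEGs


def make_cohort(n_per_class):
    """Synthetic expression matrix: the first 5 genes carry signal, the rest are noise."""
    y = np.repeat([0, 1], n_per_class)  # 0 = nontumor, 1 = tumor
    X = rng.normal(size=(2 * n_per_class, len(genes)))
    X[y == 1, :4] += 2.0  # four genes shifted up in tumors (assumed effect size)
    X[y == 1, 4] -= 2.0   # one gene shifted down in tumors
    return X, y


X_train, y_train = make_cohort(47)  # discovery set: 47 tumor vs 47 nontumor
X_valid, y_valid = make_cohort(30)  # separate synthetic validation set

# Standardize so that coefficient magnitudes are comparable across genes.
Xs = StandardScaler().fit_transform(X_train)

# LASSO step: L1-penalized logistic regression; genes with nonzero weight are kept.
lasso = LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(Xs, y_train)
lasso_genes = {g for g, w in zip(genes, lasso.coef_[0]) if w != 0}

# SVM-RFE step: repeatedly drop the gene with the smallest linear-SVM weight.
rfe = RFE(SVC(kernel="linear"), n_features_to_select=5).fit(Xs, y_train)
svm_rfe_genes = {g for g, keep in zip(genes, rfe.support_) if keep}

# Assumed combination rule: keep genes chosen by both methods.
consensus = sorted(lasso_genes & svm_rfe_genes)


def single_gene_auc(X, y, gene):
    """AUC of one gene's expression; direction-agnostic so down-regulated genes score > 0.5."""
    auc = roc_auc_score(y, X[:, genes.index(gene)])
    return max(auc, 1 - auc)


validation_auc = {g: single_gene_auc(X_valid, y_valid, g) for g in consensus}
for g, auc in validation_auc.items():
    print(f"{g}: validation AUC = {auc:.2f}")
```

Scoring each candidate on data that played no part in selection mirrors the role GSE75241 and TCGA play in the abstract. AUCs computed on the discovery set itself would be optimistically biased.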
Keyphrases
- cell migration
- machine learning
- poor prognosis
- rna seq
- genome wide
- bioinformatics analysis
- end stage renal disease
- chronic kidney disease
- artificial intelligence
- newly diagnosed
- gene expression
- squamous cell carcinoma
- deep learning
- prognostic factors
- binding protein
- climate change
- transcription factor
- childhood cancer