Integrating machine learning and single-cell analysis to uncover lung adenocarcinoma progression and prognostic biomarkers.
Pengpeng Zhang, Jiaqi Feng, Min Rui, Jiping Xie, Lianmin Zhang, Zhen-Fa Zhang
Published in: Journal of Cellular and Molecular Medicine (2024)
The progression of lung adenocarcinoma (LUAD) from atypical adenomatous hyperplasia (AAH) to invasive adenocarcinoma (IAC) involves a complex evolution of tumour cell clusters, the mechanisms of which remain largely unknown. By integrating single-cell datasets and using inferCNV, we identified and analysed tumour cell clusters to explore their heterogeneity and changes in abundance throughout LUAD progression. We applied gene set variation analysis (GSVA), pseudotime analysis, scMetabolism and CytoTRACE scores to study biological functions, metabolic profiles and stemness traits. A prognostic model based on key cluster marker genes was developed using CoxBoost and plsRcox (CPM) and evaluated across multiple cohorts for its prognostic performance and its association with tumour microenvironment characteristics, mutation landscape and immunotherapy response. We identified nine distinct tumour cell clusters, with Cluster 6 showing an early developmental stage, high stemness and high proliferative potential. The abundance of Clusters 0 and 6 increased from AAH to IAC and correlated with prognosis. The CPM model effectively distinguished prognosis in immunotherapy cohorts and predicted genomic alterations, chemotherapy drug sensitivity and immunotherapy responsiveness. S100A16, a key gene in the CPM model, was validated as an oncogene that enhances LUAD cell proliferation, invasion and migration. The CPM model emerges as a novel biomarker for predicting prognosis and immunotherapy response in LUAD patients, with S100A16 identified as a potential therapeutic target.
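As a rough illustration of how a Cox-type prognostic signature is typically applied, the sketch below computes a linear risk score from marker-gene expression and splits patients into high- and low-risk groups. It is a minimal sketch under stated assumptions, not the authors' CPM pipeline: the coefficients, the gene names other than S100A16, the patient values and the median cut-off are all hypothetical, and the CoxBoost and plsRcox fitting steps (R packages) are not reproduced here.

```python
from statistics import median

# Hypothetical coefficients for illustration only; these are NOT values
# reported in the study. GENE_B and GENE_C are placeholder gene names.
COEFFICIENTS = {"S100A16": 0.42, "GENE_B": -0.17, "GENE_C": 0.08}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Linear predictor of a Cox-type model: sum of coefficient * expression."""
    return sum(coef * expression.get(gene, 0.0)
               for gene, coef in coefficients.items())


def stratify(patients, coefficients=COEFFICIENTS):
    """Score each patient and label them relative to the cohort median.

    The median split is an assumed cut-off, chosen here for simplicity.
    """
    scores = {pid: risk_score(expr, coefficients)
              for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if score > cutoff else "low")
              for pid, score in scores.items()}
    return scores, groups


# Made-up expression values for four hypothetical patients.
patients = {
    "P1": {"S100A16": 5.0, "GENE_B": 1.0, "GENE_C": 2.0},
    "P2": {"S100A16": 1.0, "GENE_B": 4.0, "GENE_C": 0.5},
    "P3": {"S100A16": 3.0, "GENE_B": 2.0, "GENE_C": 1.0},
    "P4": {"S100A16": 0.5, "GENE_B": 0.5, "GENE_C": 3.0},
}

scores, groups = stratify(patients)
for pid in sorted(patients):
    print(pid, round(scores[pid], 3), groups[pid])
```

In practice the resulting groups would be compared with a survival analysis (for example Kaplan-Meier curves and a log-rank test) in independent cohorts, which this sketch leaves out.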
Keyphrases
- single cell
- rna seq
- stem cells
- machine learning
- genome wide
- cell proliferation
- high throughput
- cell therapy
- copy number
- squamous cell carcinoma
- epithelial mesenchymal transition
- risk assessment
- gene expression
- radiation therapy
- emergency department
- artificial intelligence
- climate change
- cell cycle
- dna methylation
- big data
- human health
- signaling pathway
- genome wide identification
- drug induced