Towards precision oncology discovery: four less known genes and their unknown interactions as highest-performed biomarkers for colorectal cancer.
Yongjun Liu, Yuqing Xu, Xiaoxing Li, Mengke Chen, Xueqin Wang, Ning Zhang, Heping Zhang, Zhengjun Zhang
Published in: NPJ Precision Oncology (2024)
The goal of this study was to use a new interpretable machine-learning framework based on max-logistic competing risk factor models to identify a parsimonious set of differentially expressed genes (DEGs) that play a pivotal role in the development of colorectal cancer (CRC). Transcriptome data from nine public datasets were analyzed, and a new Chinese cohort was collected to validate the findings. The study discovered a set of four critical DEGs (CXCL8, PSMC2, APP, and SLC20A1) that exhibit the highest accuracy in detecting CRC across diverse populations and ethnicities. Notably, PSMC2 and CXCL8 appear to play a central role in CRC, and CXCL8 alone could potentially serve as an early-stage marker for CRC. This work represents a pioneering effort in applying the max-logistic competing risk factor model to identify critical genes for human malignancies, and the interpretability and reproducibility of the results across diverse populations suggest that the four DEGs identified can provide a comprehensive description of the transcriptomic features of CRC. The practical implications of this research include the potential for personalized risk assessment, precision diagnosis, and tailored treatment plans for patients.
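The abstract names the max-logistic competing risk factor model without stating its form. As a rough illustration only, the sketch below assumes that such a classifier scores a sample with several competing linear combinations of gene expression values and passes the largest score through a logistic function. That functional form, the function names, and every intercept and weight shown are assumptions made for illustration; none of them are taken from the paper, and only the gene symbols come from the abstract.

```python
import math


def sigmoid(z):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def max_logistic_probability(expression, components):
    """Return the predicted probability under an ASSUMED max-logistic form.

    expression: dict mapping gene symbol -> expression value.
    components: list of (intercept, {gene: weight}) pairs, one per
                competing risk factor. All values are hypothetical.
    """
    scores = [
        intercept + sum(weight * expression[gene] for gene, weight in weights.items())
        for intercept, weights in components
    ]
    # The competing components "compete": only the largest score
    # determines the predicted probability.
    return sigmoid(max(scores))


# HYPOTHETICAL expression values and coefficients, for illustration only.
sample = {"CXCL8": 2.0, "PSMC2": 1.0, "APP": 0.5, "SLC20A1": 0.0}
components = [
    (-1.0, {"CXCL8": 1.0}),              # single-gene component
    (0.0, {"PSMC2": 0.5, "APP": -1.0}),  # two-gene component
]

probability = max_logistic_probability(sample, components)
print(round(probability, 4))
```

Because the logistic function is monotone, taking the maximum of the linear scores gives the same result as taking the maximum of the component probabilities, so each sample's prediction can be traced back to one component. This may be part of what makes such a model interpretable.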
Keyphrases
- genome wide
- early stage
- machine learning
- risk assessment
- risk factors
- end stage renal disease
- rna seq
- gene expression
- healthcare
- ejection fraction
- endothelial cells
- chronic kidney disease
- newly diagnosed
- genome wide identification
- bioinformatics analysis
- radiation therapy
- peritoneal dialysis
- big data
- human health
- electronic health record
- artificial intelligence
- lymph node
- climate change
- high throughput
- genome wide analysis
- genetic diversity
- adverse drug