Prediction Model of Ocular Metastases in Gastric Adenocarcinoma: Machine Learning-Based Development and Interpretation Study.
Jie Zou, Yan-Kun Shen, Shi-Nan Wu, Hong Wei, Qing-Jian Li, San Hua Xu, Qian Ling, Min Kang, Zhao-Lin Liu, Hui Huang, Xu Chen, Yi-Xin Wang, Xu-Lin Liao, Gang Tan, Yi Shao
Published in: Technology in Cancer Research & Treatment (2024)
Background: Although gastric adenocarcinoma (GA)-related ocular metastasis (OM) is rare, its occurrence indicates more severe disease. We aimed to use machine learning (ML) to analyze the risk factors for GA-related OM and to predict its risk.
Methods: This is a retrospective cohort study. Clinical data from 3532 GA patients were collected and randomly divided into training and validation sets at a ratio of 7:3. Patients were assigned to the OM group or the non-OM (NOM) group according to whether they had OM. Univariate and multivariate logistic regression analyses and least absolute shrinkage and selection operator (LASSO) regression were performed. We combined the variables identified through feature importance ranking, then refined the selection using forward sequential feature selection based on the random forest (RF) algorithm before entering the variables into the ML models. We applied six ML algorithms to construct the predictive model for GA-related OM. Predictive ability was assessed by the area under the receiver operating characteristic (ROC) curve. We also built a web-based risk calculator on the best-performing model. We used Shapley additive explanations (SHAP) to identify risk factors and to make the black-box model interpretable. All patient details were de-identified.
Results: Using 13 variables, the gradient boosting machine (GBM) model achieved the best predictive performance, with an area under the curve (AUC) of 0.997 in the test set. Using the SHAP method, we identified the key factors for OM in GA patients: LDL, CA724, CEA, AFP, CA125, Hb, CA153, and Ca²⁺. Additionally, we checked the model's reliability through an analysis of two patient cases and developed a functional online prediction calculator based on the GBM model.
Conclusion: We used ML to establish a risk prediction model for GA-related OM and showed that GBM performed best among the six ML models. The model may help identify patients with GA-related OM so that they can receive early and timely treatment.
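The random 7:3 split and the AUC evaluation described in the Methods can be illustrated with a minimal sketch. The abstract does not give the authors' implementation, so the function names `random_split` and `roc_auc` are hypothetical, the labels and scores are toy values rather than study data, and the AUC is computed with the standard pairwise (Mann-Whitney) formulation as an assumption about how such a metric is typically obtained.

```python
import random


def random_split(n_samples, train_frac=0.7, seed=0):
    """Randomly split sample indices into training and validation sets.

    Hypothetical helper: train_frac=0.7 mirrors the 7:3 ratio in the abstract.
    """
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    cut = round(train_frac * n_samples)
    return indices[:cut], indices[cut:]


def roc_auc(y_true, scores):
    """Area under the ROC curve.

    Computed as the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case, with ties counted as
    one half (Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy example: 10 samples split 7:3.
    train_idx, valid_idx = random_split(10, train_frac=0.7, seed=1)
    print(len(train_idx), len(valid_idx))  # 7 3

    # Toy labels (1 = OM, 0 = NOM) and model scores, not study data.
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(labels, scores))  # 0.75
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of OM from NOM cases. With an outcome as rare as OM, a purely random split can leave very few positive cases in the validation set, so the reliability of any AUC estimate depends on how many OM cases it contains.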