A Metabolism-Based Interpretable Machine Learning Prediction Model for Diabetic Retinopathy Risk: A Cross-Sectional Study in Chinese Patients with Type 2 Diabetes.
Guo-Wei Zong, Wan-Ying Wang, Jun Zheng, Wei Zhang, Wei-Ming Luo, Zhong-Ze Fang, Qiang Zhang
Published in: Journal of Diabetes Research (2023)
The burden of diabetic retinopathy (DR) is increasing, and sensitive biomarkers of the disease remain insufficient. Studies have found that the metabolic profile, such as amino acids (AAs) and acylcarnitines (AcylCNs), may already be altered in the early stages of DR, indicating the potential of metabolites to serve as new biomarkers. We aimed to construct a metabolite-based prediction model for DR risk. This study was conducted on type 2 diabetes (T2D) patients with or without DR. Logistic regression and extreme gradient boosting (XGBoost) prediction models were constructed using the traditional clinical features and the screening features, respectively. The predictive power of the models was assessed in terms of both discrimination and calibration, and the optimal model was interpreted using Shapley Additive exPlanations (SHAP) to quantify the effect of features on prediction. The XGBoost model incorporating AA and AcylCN variables had the best overall performance (ROCAUC = 0.82, PRAUC = 0.44, Brier score = 0.09). C18:1OH lower than 0.04 μmol/L, C18:1 lower than 0.70 μmol/L, threonine higher than 27.0 μmol/L, and tyrosine lower than 36.0 μmol/L were associated with an increased risk of developing DR. Phenylalanine higher than 52.0 μmol/L was associated with a decreased risk of developing DR. In conclusion, our study mainly used AAs and AcylCNs to construct an interpretable XGBoost model that predicts the risk of developing DR in T2D patients, which is beneficial for identifying high-risk groups and for preventing or delaying the onset of DR. In addition, our study proposed possible risk cut-off values for DR for C18:1OH, C18:1, threonine, tyrosine, and phenylalanine.
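The abstract reports three evaluation metrics: ROCAUC and PRAUC for discrimination, and the Brier score for calibration. As a minimal sketch of what these quantities measure, the pure-Python functions below compute each one from binary labels and predicted probabilities. The function names and the toy data are illustrative assumptions, not the study's code or data, and the PR-curve area is computed here as average precision, which may differ from the estimator the authors used.

```python
# Toy labels (1 = DR, 0 = no DR) and predicted risks; invented for illustration only.
y_true = [0, 0, 1, 0, 1, 0]
y_prob = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]


def brier_score(labels, probs):
    """Mean squared difference between predicted probability and observed outcome."""
    return sum((p - y) ** 2 for y, p in zip(labels, probs)) / len(labels)


def roc_auc(labels, probs):
    """Probability that a random positive is ranked above a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, probs):
    """Step-wise area under the precision-recall curve.

    Assumes no tied scores; tied predictions would need to be grouped first.
    """
    ranked = sorted(zip(probs, labels), key=lambda pair: -pair[0])
    n_pos = sum(labels)
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank  # precision at each recalled positive
    return total / n_pos


brier = brier_score(y_true, y_prob)
auc = roc_auc(y_true, y_prob)
ap = average_precision(y_true, y_prob)
```

A lower Brier score indicates better-calibrated probabilities, while higher ROCAUC and PRAUC indicate better separation of patients with and without DR. PRAUC is sensitive to outcome prevalence, so it should be read against the proportion of positive cases in the sample.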
Keyphrases
- diabetic retinopathy
- editorial comment
- type 2 diabetes
- machine learning
- end stage renal disease
- ejection fraction
- optical coherence tomography
- chronic kidney disease
- prognostic factors
- physical activity
- newly diagnosed
- cardiovascular disease
- amino acid
- adipose tissue
- MS/MS
- risk assessment
- climate change
- deep learning
- skeletal muscle
- human health