Machine learning to predict hemodynamically significant CAD based on traditional risk factors, coronary artery calcium and epicardial fat volume.
Wenji Yu, Le Yang, Feifei Zhang, Bao Liu, Yunmei Shi, Jianfeng Wang, Xiaoliang Shao, Yongjun Chen, Xiaoyu Yang, Yue-Tao Wang. Published in: Journal of nuclear cardiology : official publication of the American Society of Nuclear Cardiology (2023)
We sought to establish an explainable machine learning (ML) model to screen for hemodynamically significant coronary artery disease (CAD) based on traditional risk factors, coronary artery calcium (CAC), and epicardial fat volume (EFV) measured from non-contrast CT scans. A total of 184 symptomatic inpatients who underwent Single Photon Emission Computed Tomography/Myocardial Perfusion Imaging (SPECT/MPI) and Invasive Coronary Angiography (ICA) were enrolled. Clinical and imaging features (CAC and EFV) were collected. Hemodynamically significant CAD was defined as coronary stenosis severity ≥ 50% with a matched reversible perfusion defect on SPECT/MPI. The data were randomly split into a training cohort (70%), on which five-fold cross-validation was performed, and a test cohort (30%). Features were selected using recursive feature elimination (RFE) before the models were trained on normalized data. Three ML classifiers, logistic regression (LR), support vector machine (SVM), and XGBoost, were trained and compared to select the best predictive model for hemodynamically significant CAD. The SHapley Additive exPlanations (SHAP) method was used to generate individual explanations of the model's decisions. In the training cohort, patients with hemodynamically significant CAD had significantly higher age, body mass index (BMI), and EFV, and higher proportions of hypertension and CAC, than controls (all P < .05). In the test cohort, patients with hemodynamically significant CAD had significantly higher EFV and a higher proportion of CAC. EFV, CAC, diabetes mellitus (DM), hypertension, and hyperlipidemia were the highest-ranking features by RFE. XGBoost performed better (AUC 0.88) than the traditional LR model (AUC 0.82) and SVM (AUC 0.82) in the training cohort. Decision Curve Analysis (DCA) demonstrated that the XGBoost model had the highest net benefit.
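The net benefit that decision curve analysis compares across models can be sketched in a few lines. This is a minimal illustration of the standard formula, net benefit = TP/n − (FP/n) × pt/(1 − pt), where pt is the threshold probability; it is not the authors' code. The function names and the toy labels and risks are hypothetical and are not taken from the study.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # True positives: flagged patients who have the disease.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    # False positives: flagged patients who do not have the disease.
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - (fp / n) * (threshold / (1.0 - threshold))


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat every patient regardless of predicted risk."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * (threshold / (1.0 - threshold))


# Hypothetical toy data (1 = disease present), NOT from the study.
labels = [1, 1, 1, 0, 0, 0, 0, 1]
risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7]

for pt in (0.2, 0.5):
    print(pt, net_benefit(labels, risks, pt), net_benefit_treat_all(labels, pt))
```

A decision curve plots these values over a range of thresholds for each model, alongside the treat-all and treat-none (net benefit 0) strategies; the model whose curve lies highest over the clinically relevant thresholds is preferred.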
Validation of the XGBoost model also yielded favorable discriminatory ability, with an AUC of 0.89, sensitivity of 68.0%, specificity of 96.8%, positive predictive value (PPV) of 94.4%, negative predictive value (NPV) of 79.0%, and accuracy of 83.9%. An XGBoost model based on EFV, CAC, hypertension, DM, and hyperlipidemia was constructed and validated for assessing hemodynamically significant CAD, and it showed favorable predictive value. ML combined with SHAP can offer a transparent explanation of personalized risk prediction, enabling physicians to gain an intuitive understanding of the impact of key features in the model.
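For readers less familiar with the validation metrics quoted above, the following sketch shows how sensitivity, specificity, PPV, NPV, and accuracy follow from a 2×2 confusion matrix. The abstract does not report the underlying counts, so the counts below are hypothetical and are chosen only to make the arithmetic easy to follow; the function name is likewise illustrative.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Derive standard diagnostic metrics from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # diseased patients correctly flagged
        "specificity": tn / (tn + fp),   # healthy patients correctly cleared
        "ppv": tp / (tp + fp),           # flagged patients who are diseased
        "npv": tn / (tn + fn),           # cleared patients who are healthy
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Hypothetical counts, NOT taken from the study.
metrics = diagnostic_metrics(tp=40, fp=5, tn=45, fn=10)

for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```

A pattern of high specificity and PPV with moderate sensitivity, as reported for the XGBoost model, corresponds to few false positives relative to false negatives.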
Keyphrases
- coronary artery disease
- coronary artery
- machine learning
- computed tomography
- risk factors
- blood pressure
- magnetic resonance imaging
- type diabetes
- magnetic resonance
- high resolution
- adipose tissue
- end stage renal disease
- cardiovascular events
- percutaneous coronary intervention
- pulmonary hypertension
- chronic kidney disease
- physical activity
- skeletal muscle
- wastewater treatment
- prognostic factors
- big data
- ejection fraction
- high fat diet
- peritoneal dialysis
- high throughput
- coronary artery bypass grafting
- electronic health record
- deep learning
- image quality
- aortic valve
- neural network