Explainable Model Using Shapley Additive Explanations Approach on Wound Infection after Wide Soft Tissue Sarcoma Resection: "Big Data" Analysis Based on Health Insurance Review and Assessment Service Hub.
Ji-Hye Choi, Yumin Choi, Kwang-Sig Lee, Ki Hoon Ahn, Woo Young Jang
Published in: Medicina (Kaunas, Lithuania) (2024)
Background and Objectives: Soft tissue sarcomas are a heterogeneous group of malignant mesenchymal tumors. Despite their low prevalence, they present clinical challenges for orthopedic surgeons owing to their aggressive nature and the risk of perioperative wound infection. Their low prevalence has also hindered large-scale studies. This study aimed to analyze wound infections after wide resection in patients with soft tissue sarcomas using big data analytics from the Hub of the Health Insurance Review and Assessment Service (HIRA).

Materials and Methods: Patients who underwent wide excision of soft tissue sarcomas between 2010 and 2021 were included. Data were collected from the HIRA database, which holds information on approximately 50 million individuals in the Republic of Korea. The data included demographic information, diagnoses, prescribed medications, and surgical procedures. A random forest was used to analyze the major determinants associated with infection. A total of 10,906 observations with complete data were divided into training and validation sets in an 80:20 ratio (8773 vs. 2193 cases). Random forest permutation importance was used to identify the major predictors of infection, and Shapley Additive Explanations (SHAP) values were derived to analyze the direction of each predictor's association with infection.

Results: A total of 10,969 patients who underwent wide excision of soft tissue sarcomas were included. Among them, 886 (8.08%) had postoperative infections requiring surgery. The overall transfusion rate for wide excision was 20.67% (2267 patients). Comorbidities were analyzed as risk factors for wound infection, and dependence plots of individual features were visualized. The transfusion dependence plot shows a distinctive pattern: SHAP values are negative for patients who did not receive blood transfusions and positive for those who did, indicating a substantial contribution of blood transfusion to the predicted likelihood of wound infection.

Conclusions: In the random forest model interpreted with SHAP values, perioperative transfusion, male sex, old age, and low socioeconomic status (SES) were important features of wound infection in patients with soft tissue sarcoma.
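To make the idea behind the SHAP values concrete, the following is a minimal sketch of an exact Shapley value computation on a toy risk function. It is not the authors' random forest and does not use the SHAP library: the scoring function, its weights, the interaction term, and the binary feature names (`transfusion`, `male`, `age_over_65`, `low_ses`) are all hypothetical and chosen only for illustration. It also replaces absent features with a single all-zero baseline patient, a simplification of the background-data averaging that SHAP implementations typically perform.

```python
from itertools import permutations
from math import exp

# Hypothetical binary features loosely named after the predictors in the abstract.
FEATURES = ["transfusion", "male", "age_over_65", "low_ses"]

# Hypothetical log-odds weights; these are NOT estimates from the study.
WEIGHTS = {"transfusion": 1.2, "male": 0.4, "age_over_65": 0.5, "low_ses": 0.3}
INTERCEPT = -3.0
INTERACTION = 0.6  # hypothetical extra risk when transfusion and older age co-occur


def predict_risk(x):
    """Toy stand-in for a fitted model: returns a probability of infection."""
    z = INTERCEPT + sum(WEIGHTS[f] * x[f] for f in FEATURES)
    z += INTERACTION * x["transfusion"] * x["age_over_65"]
    return 1.0 / (1.0 + exp(-z))


def shapley_values(x, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over every ordering in which features are switched from baseline to x."""
    phi = {f: 0.0 for f in FEATURES}
    orderings = list(permutations(FEATURES))
    for order in orderings:
        current = dict(baseline)
        previous = predict_risk(current)
        for f in order:
            current[f] = x[f]
            updated = predict_risk(current)
            phi[f] += updated - previous
            previous = updated
    return {f: total / len(orderings) for f, total in phi.items()}


# Reference patient with no risk factors, and one hypothetical patient to explain.
baseline = {f: 0 for f in FEATURES}
patient = {"transfusion": 1, "male": 1, "age_over_65": 1, "low_ses": 0}

phi = shapley_values(patient, baseline)
for f in FEATURES:
    print(f"{f:12s} {phi[f]:+.4f}")
print("sum of contributions:", round(sum(phi.values()), 6))
print("risk minus baseline :", round(predict_risk(patient) - predict_risk(baseline), 6))
```

The contributions add up to the difference between the patient's predicted risk and the baseline risk, which is the additivity property that gives SHAP its name. A feature whose value matches the baseline (here `low_ses`) receives a contribution of zero, while a positive value for a feature means it pushes the prediction toward infection. Enumerating every ordering is only feasible for a handful of features; practical tools rely on faster model-specific or sampling-based approximations.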
Keyphrases
- big data
- soft tissue
- health insurance
- end stage renal disease
- risk factors
- ejection fraction
- newly diagnosed
- machine learning
- healthcare
- high grade
- mental health
- data analysis
- cardiac surgery
- deep learning
- emergency department
- social media
- patients undergoing
- case report
- coronary artery bypass
- acute coronary syndrome
- affordable care act
- surgical site infection
- sickle cell disease
- wound healing
- patient reported
- network analysis
- health information