GWO+RuleFit: rule-based explainable machine-learning combined with heuristics to predict mid-treatment FDG PET response to chemoradiation for locally advanced non-small cell lung cancer.

Chunyan Duan, Qiantuo Liu, Jiajie Wang, Qianqian Tong, Fangyun Bai, Jie Han, Shouyi Wang, Daniel S Hippe, Jing Zeng, Stephen R Bowen

Published in: Physics in Medicine and Biology (2024)

Objective. Vital rules learned from fluorodeoxyglucose positron emission tomography (FDG-PET) radiomics of tumor subregional response can provide clinical decision support for precise treatment adaptation. We combined a rule-based machine learning (ML) model (RuleFit) with a heuristic algorithm (gray wolf optimizer, GWO) for mid-chemoradiation FDG-PET response prediction in patients with locally advanced non-small cell lung cancer. Approach. Tumor subregions were identified using K-means clustering. GWO+RuleFit consists of three main parts: (i) a random forest is constructed based on conventional features or radiomic features extracted from tumor regions or subregions in FDG-PET images, from which the initial rules are generated; (ii) GWO is used for iterative rule selection; (iii) the selected rules are fit to a linear model to make predictions about the target variable. Two target variables were considered: a binary response measure (ΔSUVmean ⩾ 20% decline) for classification and a continuous response measure (ΔSUVmean) for regression. GWO+RuleFit was benchmarked against common ML algorithms and RuleFit, with leave-one-out cross-validated performance evaluated by the area under the receiver operating characteristic curve (AUC) in classification and root-mean-square error (RMSE) in regression. Main results. GWO+RuleFit selected 15 rules from the radiomic feature dataset of 23 patients. For treatment response classification, GWO+RuleFit attained numerically better cross-validated performance than RuleFit across tumor regions and sets of features (AUC: 0.58-0.86 vs. 0.52-0.78, p = 0.170-0.925). GWO+RuleFit also had the best or second-best performance numerically compared to all other algorithms for all conditions. For treatment response regression prediction, GWO+RuleFit (RMSE: 0.162-0.192) performed better numerically for low-dimensional models (p = 0.097-0.614) and significantly better for high-dimensional models across all tumor regions except one (RMSE: 0.189-0.219, p < 0.004). Significance. The rules selected by GWO+RuleFit were interpretable, highlighting distinct radiomic phenotypes that modulated treatment response. GWO+RuleFit achieved parsimonious models while maintaining utility for treatment response prediction, which can aid clinical decisions for patient risk stratification, treatment selection, and biologically driven adaptation. Clinical trial: NCT02773238.
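
Parts (ii) and (iii) of the approach, GWO-driven rule selection followed by a linear fit on the selected rules, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the candidate rules are already available as a binary patient-by-rule activation matrix (the random forest step is omitted), uses a common binary GWO variant that thresholds continuous wolf positions at 0.5, and scores a rule subset by in-sample RMSE of a small ridge-regularized linear fit plus a per-rule penalty, rather than the leave-one-out evaluation reported in the abstract. All function names, parameter values, and the toy data are hypothetical.

```python
import math
import random

def ridge_fit(X, y, lam=1e-3):
    """Solve (X^T X + lam*I) w = X^T y by Gauss-Jordan elimination.
    The small ridge term (also applied to the intercept) keeps the system solvable."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def fitness(mask, rules, y, penalty=0.01):
    """In-sample RMSE of a linear fit on the selected rules, plus a sparsity penalty.
    (Assumption: the penalty weight is arbitrary, chosen only for illustration.)"""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return float("inf")  # an empty rule set is not a usable model
    X = [[1.0] + [row[j] for j in cols] for row in rules]
    w = ridge_fit(X, y)
    mse = sum((sum(a * b for a, b in zip(w, r)) - t) ** 2 for r, t in zip(X, y)) / len(y)
    return math.sqrt(mse) + penalty * len(cols)

def gwo_select(rules, y, n_wolves=8, n_iter=30, seed=0):
    """Binary gray wolf optimizer: positions live in [0, 1]^d, rule j is kept if pos[j] > 0.5."""
    rng = random.Random(seed)
    d = len(rules[0])

    def to_mask(pos):
        return [v > 0.5 for v in pos]

    def score(pos):
        return fitness(to_mask(pos), rules, y)

    wolves = [[rng.random() for _ in range(d)] for _ in range(n_wolves)]
    wolves[0] = [1.0] * d  # design choice: start one wolf at the full rule set
    best_pos, best_fit = None, float("inf")
    for t in range(n_iter):
        wolves.sort(key=score)
        if score(wolves[0]) < best_fit:
            best_pos, best_fit = list(wolves[0]), score(wolves[0])
        leaders = [list(w) for w in wolves[:3]]  # alpha, beta, delta
        a = 2.0 * (1 - t / n_iter)               # decreases linearly from 2 toward 0
        for w in wolves:
            for k in range(d):
                est = 0.0
                for lead in leaders:
                    A = 2 * a * rng.random() - a
                    C = 2 * rng.random()
                    est += lead[k] - A * abs(C * lead[k] - w[k])
                w[k] = min(1.0, max(0.0, est / 3))  # average of the three pulls, clipped
    wolves.sort(key=score)
    if score(wolves[0]) < best_fit:
        best_pos, best_fit = list(wolves[0]), score(wolves[0])
    return to_mask(best_pos), best_fit

# Synthetic toy data (not from the study): 12 "patients", 6 candidate rules,
# with a continuous response driven by rules 0 and 2 plus small noise.
data_rng = random.Random(1)
rules = [[data_rng.randint(0, 1) for _ in range(6)] for _ in range(12)]
y = [0.3 * r[0] - 0.2 * r[2] + data_rng.gauss(0, 0.02) for r in rules]

mask, fit = gwo_select(rules, y)
print("selected rule indices:", [j for j, keep in enumerate(mask) if keep])
print("penalized RMSE:", round(fit, 4))
```

Because one wolf starts at the full rule set, the returned subset never scores worse than using every rule under this penalized criterion. In practice the fitness would more plausibly be a cross-validated metric, which matters with a cohort as small as the 23 patients described here.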
