Set-regression with applications to subgroup analysis.
Ao Yuan, Lida Wang, Ming T Tan. Published in: Statistics in Medicine (2021)
Regression is a commonly used statistical model. It is the conditional mean of the response given the covariates, μ(x) = E(Y | X = x). However, in some practical problems, the quantity of interest is the conditional mean of the response given that the covariates belong to some set A. Notably, in precision medicine and subgroup analysis in clinical trials, the aim is to identify the subjects who benefit the most from the treatment, or to identify an optimal set in the covariate space in which the treatment is favorable, so that a subject whose covariates fall in this set is classified into the favorable-treatment subgroup. Existing methods for subgroup analysis achieve this indirectly by using classical regression. This motivates us to develop a new type of regression, set-regression, defined as μ(A) = E(Y | X ∈ A), which directly addresses the subgroup analysis problem. It not only extends the classical regression model but also improves on recursive partitioning and support vector machine approaches, and it is particularly suitable for objectives involving optimization of the regression over sets, such as subgroup analysis. We show that set-regression is versatile and easy to use and that it identifies the subgroup with increased accuracy. Simulation studies also show superior performance of the proposed method in finite samples.
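To make the definition μ(A) = E(Y | X ∈ A) concrete, the following is a minimal sketch of the simplest plug-in estimate: the sample mean of the responses whose covariates fall in A. It is an illustration only and is not the estimation or optimization procedure of the paper, which the abstract does not describe. The function name `set_regression_estimate`, the membership function `in_upper_half`, and the toy data are all assumptions introduced here.

```python
import numpy as np


def set_regression_estimate(x, y, in_set):
    """Plug-in estimate of mu(A) = E(Y | X in A).

    x      : array of shape (n, d), one covariate vector per row
    y      : array of shape (n,), the responses
    in_set : hypothetical callable returning True when a covariate row lies in A
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Boolean mask marking the observations whose covariates fall in A.
    mask = np.array([bool(in_set(row)) for row in x])
    if not mask.any():
        raise ValueError("no observations fall in the set A")
    # Average the responses over the observations in A.
    return float(y[mask].mean())


# Toy data (made up for illustration): one covariate, four subjects.
x = np.array([[0.1], [0.4], [0.6], [0.9]])
y = np.array([1.0, 2.0, 4.0, 6.0])

# A = {x : x > 0.5}; the two subjects in A have responses 4.0 and 6.0.
in_upper_half = lambda row: row[0] > 0.5
mu_A = set_regression_estimate(x, y, in_upper_half)
print(mu_A)
```

In a subgroup-analysis setting one would compare such estimates across candidate sets A (and across treatment arms) and search for the set that optimizes the chosen criterion; how that search is carried out is the subject of the paper and is not reproduced here.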