A support vector machine-based cure rate model for interval censored data.
Suvra PalYingwei PengWisdom AselisewineSandip BaruiPublished in: Statistical methods in medical research (2023)
The mixture cure rate model is the most commonly used cure rate model in the literature. In the context of mixture cure rate model, the standard approach to model the effect of covariates on the cured or uncured probability is to use a logistic function. This readily implies that the boundary classifying the cured and uncured subjects is linear. In this article, we propose a new mixture cure rate model based on interval censored data that uses the support vector machine to model the effect of covariates on the uncured or the cured probability (i.e. on the incidence part of the model). Our proposed model inherits the features of the support vector machine and provides flexibility to capture classification boundaries that are nonlinear and more complex. The latency part is modeled by a proportional hazards structure with an unspecified baseline hazard function. We develop an estimation procedure based on the expectation maximization algorithm to estimate the cured/uncured probability and the latency model parameters. Our simulation study results show that the proposed model performs better in capturing complex classification boundaries when compared to both logistic regression-based and spline regression-based mixture cure rate models. We also show that our model's ability to capture complex classification boundaries improve the estimation results corresponding to the latency part of the model. For illustrative purpose, we present our analysis by applying the proposed methodology to the NASA's Hypobaric Decompression Sickness Database.