Temporal Prediction of Future State Occupation in a Multistate Model from High-Dimensional Baseline Covariates via Pseudo-Value Regression.

Sandipan Dutta, Susmita Datta, Somnath Datta
Published in: Journal of Statistical Computation and Simulation (2016)
In many complex diseases such as cancer, a patient passes through various disease stages before reaching a terminal state (say, disease-free or death). This fits a multistate model framework, where a prognosis may be equivalent to predicting the state occupation at a future time t. With the advent of high-throughput genomic and proteomic assays, a clinician may intend to use such high-dimensional covariates to make better predictions of state occupation. In this article, we offer a practical solution to this problem by combining a useful technique, called pseudo-value regression, with a latent factor or a penalized regression method such as partial least squares (PLS) or the least absolute shrinkage and selection operator (LASSO), or their variants. We explore the predictive performance of these combinations in various high-dimensional settings via extensive simulation studies. Overall, this strategy works fairly well provided the models are tuned properly. For the purpose of temporal prediction of future state occupation, PLS turns out to be slightly better than LASSO in most of the settings we investigated. We illustrate the utility of these pseudo-value-based high-dimensional regression methods using a lung cancer data set, where we use the patients' baseline gene expression values.
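
To make the pseudo-value idea concrete, here is a minimal sketch of the standard jackknife construction, in which the pseudo-value for subject i is n times the full-sample estimate minus (n - 1) times the leave-one-out estimate. It is not the authors' implementation. As a simplifying assumption it uses the two-state (alive/dead) special case, where the probability of still occupying the initial state at time t is the Kaplan-Meier survival estimate; the function names `km_survival` and `pseudo_values` are hypothetical.

```python
def km_survival(times, events, t):
    """Kaplan-Meier estimate of P(T > t) from right-censored data.

    times  : list of observed times
    events : list of 1 (event observed) or 0 (censored)
    """
    surv = 1.0
    # Distinct observed event times up to and including t.
    event_times = sorted({u for u, d in zip(times, events) if d == 1 and u <= t})
    for u in event_times:
        at_risk = sum(1 for x in times if x >= u)
        deaths = sum(1 for x, d in zip(times, events) if x == u and d == 1)
        surv *= 1.0 - deaths / at_risk
    return surv


def pseudo_values(times, events, t):
    """Jackknife pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    times, events = list(times), list(events)
    n = len(times)
    full = km_survival(times, events, t)
    values = []
    for i in range(n):
        # Re-estimate with subject i left out.
        loo = km_survival(times[:i] + times[i + 1:], events[:i] + events[i + 1:], t)
        values.append(n * full - (n - 1) * loo)
    return values


# Illustrative (made-up) data: the third subject is censored at time 3.
pv = pseudo_values([1.0, 2.0, 3.0, 4.0, 5.0], [1, 1, 0, 1, 1], t=3.5)
```

Each subject's pseudo-value then serves as a complete-data response that can be regressed on that subject's baseline covariates, for example with LASSO or PLS. Without censoring, the pseudo-values reduce to the indicators of T > t.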
Keyphrases