Improving estimation and prediction in linear regression incorporating external information from an established reduced model.
Wenting Cheng, Jeremy M G Taylor, Pantel S Vokonas, Sung Kyun Park, Bhramar Mukherjee
Published in: Statistics in Medicine (2018)
We consider a situation in which rich historical information from a large study is available in the form of coefficient estimates and their standard errors for a linear regression model describing the association between a continuous outcome variable Y and a set of predictors X. We would like to use this summary information to improve inference in an expanded model of interest, Y given X and B. The additional variable B is a new biomarker, measured on a small number of subjects in a new dataset. We formulate the problem in an inferential framework in which the historical information is translated into nonlinear constraints on the parameter space, and we propose both frequentist and Bayesian solutions. We show that a Bayesian transformation approach proposed by Gunn and Dunson is a simple and effective computational method for approximate Bayesian inference in this constrained parameter problem. Simulation results comparing these methods indicate that historical information on E(Y|X) can improve the efficiency of estimation and enhance predictive power in the regression model of interest, E(Y|X,B). We illustrate our methodology by enhancing a published prediction model for bone lead levels, based on blood lead and other covariates, with a new biomarker defined through a genetic risk score.
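To make the idea of constraints concrete, here is a minimal sketch and not the authors' estimator. It assumes, purely for illustration, that E(B|X) is linear in X with coefficients theta, so that the reduced-model coefficients satisfy beta_0 = gamma_0 + gamma_B * theta_0 and beta_X = gamma_X + gamma_B * theta_X. It also treats the historical coefficients as exactly known, ignoring their standard errors, and plugs in an ordinary least squares estimate of theta from the small dataset. The function name `constrained_fit` and all variable names are hypothetical.

```python
import numpy as np

def constrained_fit(X, B, y, beta_0, beta_x):
    """Fit E(Y|X,B) = gamma_0 + X @ gamma_x + gamma_b * B subject to
    beta_0 = gamma_0 + gamma_b * theta_0 and beta_x = gamma_x + gamma_b * theta_x,
    where (beta_0, beta_x) are the historical reduced-model coefficients
    (assumed known exactly) and theta comes from an assumed linear E(B|X)."""
    n = X.shape[0]
    X1 = np.column_stack([np.ones(n), X])

    # Step 1: estimate E(B|X) = theta_0 + X @ theta_x by OLS in the small dataset.
    theta, *_ = np.linalg.lstsq(X1, B, rcond=None)

    # Under the constraints, the residual of the expanded model reduces to
    # r_y - gamma_b * r_b, so gamma_b is a no-intercept regression slope.
    r_b = B - X1 @ theta
    r_y = y - (beta_0 + X @ beta_x)
    gamma_b = float(r_b @ r_y / (r_b @ r_b))

    # Step 2: back out the remaining coefficients from the constraints.
    gamma_0 = beta_0 - gamma_b * theta[0]
    gamma_x = beta_x - gamma_b * theta[1:]
    return {"gamma_0": gamma_0, "gamma_x": gamma_x,
            "gamma_b": gamma_b, "theta": theta}

# Usage on simulated data (all values below are made up for illustration).
rng = np.random.default_rng(0)
n = 40                                   # small new dataset
X = rng.normal(size=(n, 2))
B = 0.5 + X @ np.array([1.0, -0.5]) + rng.normal(size=n)
y = 1.0 + X @ np.array([2.0, 0.3]) + 0.8 * B + rng.normal(size=n)

# Reduced-model coefficients implied by the simulation truth,
# standing in for the historical estimates.
beta_0 = 1.0 + 0.8 * 0.5
beta_x = np.array([2.0, 0.3]) + 0.8 * np.array([1.0, -0.5])

fit = constrained_fit(X, B, y, beta_0, beta_x)
print(fit["gamma_0"], fit["gamma_x"], fit["gamma_b"])
```

In this simplified setting only gamma_b is estimated freely from the small dataset; the other coefficients are determined by the constraints, which is where the efficiency gain comes from. The product gamma_b * theta is what makes the constraints nonlinear in the full parameter vector.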