Improving prediction of linear regression models by integrating external information from heterogeneous populations: James-Stein estimators.

Peisong Han, Haoyue Li, Sung Kyun Park, Bhramar Mukherjee, Jeremy M G Taylor
Published in: Biometrics (2024)
We consider the setting where (1) an internal study builds a linear regression model for prediction based on individual-level data, (2) some external studies have fitted similar linear regression models that use only subsets of the covariates and provide coefficient estimates for the reduced models without individual-level data, and (3) there is heterogeneity across these study populations. The goal is to integrate the external model summary information into fitting the internal model to improve prediction accuracy. We adapt the James-Stein shrinkage method to propose estimators that are no worse and are oftentimes better in the prediction mean squared error after information integration, regardless of the degree of study population heterogeneity. We conduct comprehensive simulation studies to investigate the numerical performance of the proposed estimators. We also apply the method to enhance a prediction model for patella bone lead level in terms of blood lead level and other covariates by integrating summary information from published literature.
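To illustrate the shrinkage idea the abstract builds on, here is a minimal sketch of the classical positive-part James-Stein estimator, which pulls an estimate toward a target vector. It is not the estimator proposed in the paper. The function name `james_stein_shrink`, the simplifying assumption that the estimate's components are independent with a common known variance, and the reading of the target as coefficients informed by external summary information are all assumptions made for illustration.

```python
def james_stein_shrink(estimate, target, variance):
    """Classical positive-part James-Stein shrinkage toward `target`.

    Illustrative assumptions (not taken from the paper):
    - `estimate` has independent components with common known `variance`;
    - `target` is a fixed vector, e.g. coefficients suggested by external
      summary information;
    - the dimension is at least 3, as the classical result requires.
    """
    p = len(estimate)
    if p != len(target):
        raise ValueError("estimate and target must have the same length")
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 dimensions")

    diff = [e - t for e, t in zip(estimate, target)]
    dist2 = sum(d * d for d in diff)  # squared distance from the target
    if dist2 == 0.0:
        return list(target)  # estimate already equals the target

    # Weight kept on the internal estimate, truncated at zero (positive part).
    weight = max(0.0, 1.0 - (p - 2) * variance / dist2)
    return [t + weight * d for t, d in zip(target, diff)]


# Hypothetical numbers, for illustration only.
internal = [1.0, 2.0, 3.0, 4.0]   # stand-in for internal coefficient estimates
external = [0.0, 0.0, 0.0, 0.0]   # stand-in for an externally informed target
print(james_stein_shrink(internal, external, variance=1.0))
```

When the internal estimate lies far from the target relative to its variance, the weight approaches 1 and the estimate is left nearly unchanged; when it lies close, the estimate is pulled toward the target. This data-driven weighting is what lets shrinkage of this kind remain safe when the target is poorly matched, which is the property the abstract claims under population heterogeneity.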
Keyphrases
  • health information
  • magnetic resonance
  • electronic health record
  • magnetic resonance imaging
  • social media
  • big data
  • body composition
  • artificial intelligence