Fridge: Focused fine-tuning of ridge regression for personalized predictions.

Kristoffer H Hellton, Nils Lid Hjort
Published in: Statistics in medicine (2018)
Statistical prediction methods typically require some form of fine-tuning of tuning parameter(s), with K-fold cross-validation as the canonical procedure. For ridge regression, numerous procedures exist, but common to all of them, including cross-validation, is that a single tuning parameter is chosen for all future predictions. We propose instead to calculate a unique tuning parameter for each individual for whom we wish to predict an outcome. This generates an individualized prediction by focusing on the vector of covariates of a specific individual. The focused ridge (fridge) procedure is introduced with a 2-part contribution: First, we define an oracle tuning parameter that minimizes the mean squared prediction error for a specific covariate vector, and then we propose to estimate this tuning parameter by using plug-in estimates of the regression coefficients and the error variance parameter. The procedure is extended to logistic ridge regression by using the parametric bootstrap. For high-dimensional data, we propose to use ridge regression with cross-validation as the plug-in estimate, and simulations show that fridge gives smaller average prediction error than ridge with cross-validation for both simulated and real data. We illustrate the new concept for both linear and logistic regression models in 2 applications of personalized medicine: predicting individual risk and treatment response based on gene expression data. The method is implemented in the R package fridge.
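The two-step idea in the abstract (an oracle tuning parameter for one covariate vector, then a plug-in estimate of it) can be illustrated with a minimal Python sketch for the linear, low-dimensional case. This is not the API of the R package fridge. The function names, the grid search over candidate values, and the use of ordinary least squares as the plug-in (which assumes more observations than covariates) are all assumptions made for illustration. The sketch uses the standard decomposition of the mean squared error of the ridge prediction x0'β̂(λ) into squared bias, (λ x0'(X'X + λI)⁻¹β)², plus variance, σ² x0'(X'X + λI)⁻¹ X'X (X'X + λI)⁻¹ x0.

```python
import numpy as np

def plugin_mse_terms(X, y, x0, lam):
    """Plug-in squared bias and variance of the ridge prediction x0' beta_hat(lam).

    Assumption: n > p, so OLS can serve as the plug-in for beta and sigma^2.
    """
    n, p = X.shape
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)       # plug-in for beta
    resid = y - X @ beta_ols
    sigma2 = resid @ resid / (n - p)               # plug-in for the error variance
    w = np.linalg.solve(XtX + lam * np.eye(p), x0)  # (X'X + lam I)^{-1} x0
    bias = -lam * (w @ beta_ols)
    var = sigma2 * (w @ XtX @ w)
    return bias ** 2, var

def focused_ridge_lambda(X, y, x0, lambdas):
    """Return the candidate lambda with the smallest plug-in MSE for this x0."""
    mses = [sum(plugin_mse_terms(X, y, x0, lam)) for lam in lambdas]
    return lambdas[int(np.argmin(mses))]

def ridge_predict(X, y, x0, lam):
    """Ridge prediction for a single covariate vector x0 at tuning parameter lam."""
    p = X.shape[1]
    beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
    return x0 @ beta_ridge
```

Calling `focused_ridge_lambda` once per new individual gives each prediction its own tuning parameter, in contrast to cross-validation, which selects one value for all predictions. The high-dimensional and logistic cases described in the abstract need a different plug-in (ridge with cross-validation) and a parametric bootstrap, which this sketch does not cover.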
Keyphrases
  • gene expression
  • electronic health record
  • big data
  • minimally invasive
  • air pollution
  • dna methylation
  • machine learning