
A scalable hierarchical lasso for gene-environment interactions.

Natalia Zemlianskaia, W James Gauderman, Juan Pablo Lewinger
Published in: Journal of computational and graphical statistics : a joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America (2022)
We describe a regularized regression model for the selection of gene-environment (G×E) interactions. The model focuses on a single environmental exposure and induces a main-effect-before-interaction hierarchical structure. We propose an efficient fitting algorithm and screening rules that can discard large numbers of irrelevant predictors with high accuracy. We present simulation results showing that the model outperforms existing joint selection methods for G×E interactions in terms of selection performance, scalability, and speed, and we provide a real data application. Our implementation is available in the gesso R package.
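To make the "main-effect-before-interaction" idea concrete, the sketch below builds G×E interaction columns for a single exposure and checks whether a set of coefficients respects the hierarchy, meaning an interaction is nonzero only if its genetic main effect is nonzero. This is an illustration only, not the gesso fitting algorithm or its API; the function names `interaction_columns` and `violates_hierarchy`, the tolerance `tol`, and the toy data are all assumptions introduced here.

```python
from typing import List, Sequence


def interaction_columns(G: Sequence[Sequence[float]],
                        E: Sequence[float]) -> List[List[float]]:
    """Multiply every genotype column by the single exposure E, row by row.

    G is an n-by-p genotype matrix (list of rows); E has length n.
    The result is the n-by-p matrix of G x E interaction terms.
    """
    if len(G) != len(E):
        raise ValueError("G and E must have the same number of rows")
    return [[g_ij * e_i for g_ij in row] for row, e_i in zip(G, E)]


def violates_hierarchy(beta: Sequence[float],
                       gamma: Sequence[float],
                       tol: float = 1e-12) -> List[int]:
    """Return indices j where the interaction gamma[j] is nonzero
    while the main effect beta[j] is (numerically) zero.

    An empty list means the coefficients satisfy the
    main-effect-before-interaction hierarchy.
    """
    if len(beta) != len(gamma):
        raise ValueError("beta and gamma must have the same length")
    return [j for j, (b, g) in enumerate(zip(beta, gamma))
            if abs(g) > tol and abs(b) <= tol]


# Toy data (hypothetical): 3 subjects, 3 variants, one binary exposure.
G = [[0, 1, 2],
     [1, 0, 1],
     [2, 2, 0]]
E = [1.0, 0.0, 1.0]
GxE = interaction_columns(G, E)

# Hypothetical coefficients: variant 1 has an interaction but no main effect.
beta = [0.5, 0.0, 0.3]
gamma = [0.2, 0.1, 0.0]
violations = violates_hierarchy(beta, gamma)
```

With the toy coefficients above, `violations` flags the second variant (index 1), which has an interaction effect without a main effect. A hierarchical penalty of the kind the abstract describes is designed so that fitted coefficients never end up in that state.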
Keyphrases
  • primary care
  • genome wide
  • healthcare
  • machine learning
  • gene expression
  • electronic health record
  • deep learning
  • big data