High dimensional mediation analysis with latent variables.

Andriy Derkach, Ruth M Pfeiffer, Ting-Huei Chen, Joshua N Sampson
Published in: Biometrics (2019)
We propose a model for high-dimensional mediation analysis that includes latent variables. We describe our model in the context of an epidemiologic study for incident breast cancer with one exposure and a large number of biomarkers (i.e., potential mediators). We assume that the exposure directly influences a group of latent, or unmeasured, factors which are associated with both the outcome and a subset of the biomarkers. The biomarkers associated with the latent factors linking the exposure to the outcome are considered "mediators." We derive the likelihood for this model and develop an expectation-maximization algorithm to maximize an L1-penalized version of this likelihood to limit the number of factors and associated biomarkers. We show that the resulting estimates are consistent and that the estimates of the nonzero parameters have an asymptotically normal distribution. In simulations, procedures based on this new model can have significantly higher power for detecting the mediating biomarkers than simpler approaches. We apply our method to a study that evaluates the relationship between body mass index, 481 metabolic measurements, and estrogen receptor-positive breast cancer.
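The structure described in the abstract can be written down schematically. The sketch below is only an illustration under stated assumptions: the symbols (X for the exposure, F for the K latent factors, M for the p biomarkers, Y for the outcome), the linear and logistic forms, the Gaussian errors, and the choice of which parameters carry the L1 penalty are all hypothetical and are not taken from the paper.

```latex
% Illustrative sketch only; functional forms and notation are assumptions.
% X: exposure, F: K latent factors, M: p biomarkers, Y: binary outcome.
\begin{align*}
  F &= \alpha X + \varepsilon_F,
      & \varepsilon_F &\sim N(0, I_K)
      && \text{(exposure influences latent factors)} \\
  M &= \mu + \Lambda F + \varepsilon_M,
      & \varepsilon_M &\sim N(0, \Psi)
      && \text{(factors load on a subset of biomarkers)} \\
  \operatorname{logit} P(Y = 1 \mid F) &= \beta_0 + \beta^{\top} F
      &&
      && \text{(factors are associated with the outcome)}
\end{align*}

% One possible L1-penalized objective, with observed-data log-likelihood
% \ell(\theta) and an assumed tuning parameter \lambda \ge 0:
\[
  \hat{\theta}
  = \arg\max_{\theta}
    \Bigl\{ \ell(\theta)
      - \lambda \bigl( \lVert \alpha \rVert_1
                     + \lVert \beta \rVert_1
                     + \lVert \Lambda \rVert_1 \bigr) \Bigr\}
\]
```

Under this reading, biomarker j would count as a mediator when some factor k has a nonzero exposure effect, a nonzero outcome effect, and a nonzero loading on that biomarker, that is, when \alpha_k, \beta_k, and \Lambda_{jk} are all nonzero. Because F is unobserved, an expectation-maximization scheme is the natural way to maximize such an objective, and the penalty shrinks unneeded factors and loadings to zero.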
Keyphrases
  • body mass index
  • estrogen receptor
  • positive breast cancer
  • machine learning
  • cardiovascular disease
  • physical activity
  • social support
  • weight gain
  • deep learning
  • risk assessment
  • weight loss