Selecting Invalid Instruments to Improve Mendelian Randomization with Two-Sample Summary Data.

Ashish Patel, Francis J DiTraglia, Verena Zuber, Stephen Burgess
Published in: The Annals of Applied Statistics (2024)
Mendelian randomization (MR) is a widely used method to estimate the causal relationship between a risk factor and disease. A fundamental part of any MR analysis is to choose appropriate genetic variants as instrumental variables. Genome-wide association studies often reveal that hundreds of genetic variants may be robustly associated with a risk factor, but in some situations investigators may have greater confidence in the instrument validity of only a smaller subset of variants. Nevertheless, the use of additional instruments may be optimal from the perspective of mean squared error even if they are slightly invalid; a small bias in estimation may be a price worth paying for a larger reduction in variance. For this purpose, we consider a method for "focused" instrument selection whereby genetic variants are selected to minimise the estimated asymptotic mean squared error of causal effect estimates. In a setting of many weak and locally invalid instruments, we propose a novel strategy to construct confidence intervals for post-selection focused estimators that guards against the worst-case loss in asymptotic coverage. In empirical applications to (i) validate lipid drug targets and (ii) investigate vitamin D effects on a wide range of outcomes, our findings suggest that the optimal selection of instruments involves not only a small number of biologically justified instruments, but also many potentially invalid instruments.
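The bias-variance trade-off described in the abstract can be illustrated with a toy calculation. The sketch below is not the authors' focused selection procedure. It assumes a simple inverse-variance weighted (IVW) estimator with independent variants, variant-exposure associations known without error, and direct (pleiotropic) effects `alpha` that are known, whereas in practice they would have to be estimated. The function name `ivw_mse` and all numbers are hypothetical.

```python
def ivw_mse(beta_x, alpha, se_y):
    """Mean squared error of a toy IVW estimator.

    Assumed model for variant j (illustrative only):
        beta_y[j] = theta * beta_x[j] + alpha[j] + noise,  sd(noise) = se_y[j]
    with beta_x treated as fixed and known.
    """
    weights = [1.0 / s**2 for s in se_y]
    denom = sum(w * bx**2 for w, bx in zip(weights, beta_x))
    # Bias comes from the direct effects alpha of invalid instruments.
    bias = sum(w * bx * a for w, bx, a in zip(weights, beta_x, alpha)) / denom
    # Variance shrinks as more instruments are added.
    variance = 1.0 / denom
    return bias**2 + variance


# Hypothetical inputs: 3 valid variants followed by 7 slightly invalid ones.
beta_x = [0.1] * 10
se_y = [0.05] * 10
alpha = [0.0] * 3 + [0.005] * 7

mse_valid_only = ivw_mse(beta_x[:3], alpha[:3], se_y[:3])
mse_all = ivw_mse(beta_x, alpha, se_y)

print("MSE, valid instruments only:", mse_valid_only)
print("MSE, all instruments:       ", mse_all)
```

With these assumed numbers, the squared bias introduced by the seven slightly invalid variants is smaller than the variance reduction they bring, so the full set has the lower mean squared error. Larger values of `alpha` would reverse that ordering.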
Keyphrases
  • patient reported outcomes
  • risk factors
  • genome wide association
  • magnetic resonance
  • healthcare
  • metabolic syndrome
  • emergency department
  • machine learning
  • genome wide
  • adipose tissue
  • fatty acid
  • big data