Composite kernel machine regression based on likelihood ratio test for joint testing of genetic and gene-environment interaction effect.

Ni Zhao, Haoyu Zhang, Jennifer J Clark, Arnab Maity, Michael C Wu
Published in: Biometrics (2019)
Most common human diseases result from the combined effect of genes, environmental factors, and their interactions, so including gene-environment (GE) interactions can improve power in gene mapping studies. The standard strategy is to test the SNPs one by one, using a regression model that includes both the SNP effect and the GE interaction. However, the SNP-by-SNP approach has serious limitations, such as the inability to model epistatic SNP effects, biased estimation, and reduced power. Thus, in this article, we develop a kernel machine regression framework to model the overall genetic effect of a SNP-set, considering the possible GE interaction. Specifically, we use a composite kernel to specify the overall genetic effect via a nonparametric function, and we model additional covariates parametrically within the regression framework. The composite kernel is constructed as a weighted average of two kernels, one corresponding to the genetic main effect and one corresponding to the GE interaction effect. We propose a likelihood ratio test (LRT) and a restricted likelihood ratio test (RLRT) for statistical significance. We derive a Monte Carlo approach for the finite-sample distributions of the LRT and RLRT statistics. Extensive simulations and real data analysis show that our proposed method has correct type I error and can have higher power than score-based approaches in many situations.
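
The composite kernel described above can be illustrated with a minimal sketch. The abstract states only that the kernel is a weighted average of a genetic main-effect kernel and a GE interaction kernel; everything else here is an assumption made for illustration: the function names, the weight name `rho`, the choice of linear kernels, and the construction of the interaction kernel as an elementwise product of the genetic kernel with the outer product of the environmental variable. It is not the authors' implementation.

```python
import numpy as np


def linear_kernel(X):
    """Linear kernel X X^T for an n-by-p matrix X."""
    return X @ X.T


def composite_kernel(G, E, rho):
    """Weighted average of a main-effect kernel and a GE interaction kernel.

    Assumptions (not taken from the abstract): both component kernels are
    linear, and the interaction kernel is the elementwise product of the
    genetic kernel with the outer product of the environmental variable.
    """
    if not 0.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [0, 1]")
    K_g = linear_kernel(G)           # genetic main effect
    K_ge = K_g * np.outer(E, E)      # hypothetical GE interaction kernel
    return (1.0 - rho) * K_g + rho * K_ge


# Simulated data: n subjects, p SNPs coded as 0/1/2 minor-allele counts,
# and one continuous environmental variable.
rng = np.random.default_rng(0)
n, p = 50, 10
G = rng.binomial(2, 0.3, size=(n, p)).astype(float)
E = rng.normal(size=n)

K = composite_kernel(G, E, rho=0.5)
```

Under these assumptions, `rho = 0` reduces the composite kernel to the main-effect kernel alone and `rho = 1` to the interaction kernel alone, so the weight controls how much of the modeled genetic effect is attributed to GE interaction.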
Keyphrases
  • genome wide
  • dna methylation
  • copy number
  • monte carlo
  • data analysis
  • high density
  • gene expression
  • magnetic resonance
  • mass spectrometry
  • genome wide identification
  • molecular dynamics
  • machine learning