
Robust gene-environment interaction analysis using penalized trimmed regression.

Yaqing Xu, Mengyun Wu, Shuangge Ma, Syed Ejaz Ahmed
Published in: Journal of Statistical Computation and Simulation (2018)
In biomedical and epidemiological studies, gene-environment (G-E) interactions have been shown to contribute substantially to the etiology and progression of many complex diseases. Most existing approaches for identifying G-E interactions lack robustness against outliers and contamination in the response and predictor spaces. In this study, we develop a novel robust G-E identification approach using the trimmed regression technique under joint modeling. A robust data-driven criterion and stability selection are adopted to determine the trimmed subset, which is free from both vertical outliers and leverage points. An effective penalization approach is developed to identify important G-E interactions while respecting the "main effects, interactions" hierarchical structure. Extensive simulations demonstrate that the proposed approach outperforms multiple alternatives. In the analysis of TCGA data on cutaneous melanoma and breast invasive carcinoma, the proposed approach yields interesting findings with superior prediction accuracy and stability.
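To make the idea of trimmed regression under joint G-E modeling more concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a plain ridge penalty in place of the hierarchical penalty described above, a fixed subset size `h` in place of the data-driven criterion and stability selection, and simulated data with vertical outliers only. The function names `build_joint_design` and `trimmed_ridge` and all parameter values are hypothetical, chosen for illustration.

```python
import numpy as np


def build_joint_design(E, G):
    """Joint G-E design: intercept, E main effects, G main effects, G x E products.

    Interaction columns are gene-major: column j * q_E + k holds G[:, j] * E[:, k].
    """
    n = E.shape[0]
    interactions = (G[:, :, None] * E[:, None, :]).reshape(n, -1)
    return np.hstack([np.ones((n, 1)), E, G, interactions])


def trimmed_ridge(X, y, h, lam=1e-2, n_starts=20, max_csteps=50, seed=0):
    """Least trimmed squares with a ridge penalty, fitted by concentration steps.

    Assumption: ridge is only a stand-in penalty; it does not enforce the
    "main effects, interactions" hierarchy.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_obj, best_beta, best_subset = np.inf, None, None
    for _ in range(n_starts):
        # Start from a random subset of h observations.
        subset = np.sort(rng.choice(n, size=h, replace=False))
        for _ in range(max_csteps):
            Xs, ys = X[subset], y[subset]
            beta = np.linalg.solve(Xs.T @ Xs + lam * np.eye(p), Xs.T @ ys)
            resid2 = (y - X @ beta) ** 2
            # Concentration step: keep the h observations with smallest residuals.
            new_subset = np.sort(np.argsort(resid2)[:h])
            converged = np.array_equal(new_subset, subset)
            subset = new_subset
            if converged:
                break
        obj = resid2[subset].sum()  # trimmed sum of squared residuals
        if obj < best_obj:
            best_obj, best_beta, best_subset = obj, beta, subset
    return best_beta, best_subset


# Simulated example (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, q_E, q_G = 100, 2, 3
E = rng.normal(size=(n, q_E))
G = rng.normal(size=(n, q_G))
X = build_joint_design(E, G)

beta_true = np.zeros(X.shape[1])
beta_true[[0, 1, 3]] = [1.0, 2.0, -1.5]  # intercept, first E, first G
beta_true[1 + q_E + q_G] = 1.0           # first G x first E interaction

y = X @ beta_true + rng.normal(scale=0.5, size=n)
outliers = np.arange(10)
y[outliers] += 50.0                      # contaminate 10 responses

beta_hat, kept = trimmed_ridge(X, y, h=75)
print("contaminated points kept:", np.intersect1d(kept, outliers).size)
print("max abs coefficient error:", np.max(np.abs(beta_hat - beta_true)))
```

The sketch refits on the `h` best-fitting observations until the subset stops changing, so grossly contaminated responses drop out of the fit. Handling leverage points and selecting interactions hierarchically, as the abstract describes, would require the robust criterion and penalty from the paper itself.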
Keyphrases
  • copy number
  • genome wide
  • molecular dynamics
  • electronic health record
  • dna methylation
  • big data
  • artificial intelligence
  • data analysis