Bi-level structured functional analysis for genome-wide association studies.

Mengyun Wu, Fan Wang, Yeheng Ge, Shuangge Ma, Yang Li
Published in: Biometrics (2023)
Genome-wide association studies (GWAS) have been highly successful in identifying genotype-phenotype associations for complex human diseases. In such studies, the high dimensionality of single nucleotide polymorphisms (SNPs) often makes analysis difficult. Functional analysis, which treats SNPs densely distributed in a chromosomal region as a continuous process rather than as discrete observations, has emerged as a promising way to overcome the challenges of high dimensionality. However, most existing functional studies remain individual SNP-based and cannot sufficiently account for the intricate underlying structure of SNP data. SNPs often occur in groups (for example, genes or pathways) and thus have a natural group structure. Additionally, these SNP groups can be highly correlated, share coordinated biological functions, and interact in a network. Motivated by these characteristics of SNP data, we develop a novel bi-level structured functional analysis method that investigates disease-associated genetic variants at the SNP level and the SNP group level simultaneously. A penalization technique is adopted both for bi-level selection and to accommodate the group-level network structure. Both estimation and selection consistency properties are rigorously established. Extensive simulation studies show the superiority of the proposed method over alternatives. An application to Type 2 diabetes SNP data yields some biologically intriguing results.
Keyphrases
  • genome wide
  • genome wide association
  • type 2 diabetes
  • dna methylation
  • machine learning
  • electronic health record
  • big data
  • case control
  • skeletal muscle
  • network analysis
  • neural network
  • artificial intelligence