Bi-level structured functional analysis for genome-wide association studies.
Mengyun Wu, Fan Wang, Yeheng Ge, Shuangge Ma, Yang Li. Published in: Biometrics (2023)
Genome-wide association studies (GWAS) have led to great successes in identifying genotype-phenotype associations for complex human diseases. In such studies, the high dimensionality of single nucleotide polymorphisms (SNPs) often makes analysis difficult. Functional analysis, which interprets SNPs densely distributed in a chromosomal region as a continuous process rather than as discrete observations, has emerged as a promising avenue for overcoming this high-dimensionality challenge. However, most existing functional studies remain individual SNP-based and cannot sufficiently account for the intricate underlying structure of SNP data. SNPs naturally fall into groups (for example, genes or pathways). Additionally, these SNP groups can be highly correlated, with coordinated biological functions, and interact in a network. Motivated by these unique characteristics of SNP data, we develop a novel bi-level structured functional analysis method that investigates disease-associated genetic variants at the SNP level and the SNP-group level simultaneously. Penalization is adopted both for bi-level selection and to accommodate the group-level network structure. Both the estimation and selection consistency properties are rigorously established. The superiority of the proposed method over alternatives is shown through extensive simulation studies. An application to Type 2 diabetes SNP data yields some biologically intriguing results.
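To make the idea of bi-level penalization with a group-level network concrete, the following is a minimal generic sketch and not the penalty used in the paper, whose exact form the abstract does not give. It assumes a sparse-group-lasso-style penalty (an L1 term on individual coefficients plus size-weighted L2 norms on groups) and adds a quadratic smoothness term over a group adjacency matrix. The function name `bilevel_network_penalty`, the tuning parameters `lam1`, `lam2`, `lam3`, and the specific network term are all illustrative assumptions. In a functional analysis the coefficients would typically be basis-expansion coefficients of a genetic effect function, which is likewise an assumption here.

```python
import numpy as np


def bilevel_network_penalty(coefs, groups, adjacency, lam1, lam2, lam3):
    """Hypothetical bi-level penalty with a group-level network term.

    coefs     : 1-D array of coefficients (e.g. basis coefficients).
    groups    : list of index lists, one per SNP group (gene or pathway).
    adjacency : (G, G) symmetric array of non-negative group-network weights.
    lam1      : weight of the individual-level (L1) term.
    lam2      : weight of the group-level (L2-norm) term.
    lam3      : weight of the network smoothness term.
    """
    coefs = np.asarray(coefs, dtype=float)
    adjacency = np.asarray(adjacency, dtype=float)

    # Individual level: L1 norm encourages sparsity within groups.
    individual_term = np.sum(np.abs(coefs))

    # Group level: size-weighted L2 norms can zero out whole groups.
    sizes = np.array([len(g) for g in groups], dtype=float)
    norms = np.array([np.linalg.norm(coefs[g]) for g in groups])
    group_term = np.sum(np.sqrt(sizes) * norms)

    # Network level (assumed form): connected groups are pushed toward
    # similar size-scaled norms. The factor 0.5 counts each pair once.
    scaled = norms / np.sqrt(sizes)
    diff = scaled[:, None] - scaled[None, :]
    network_term = 0.5 * np.sum(adjacency * diff ** 2)

    return lam1 * individual_term + lam2 * group_term + lam3 * network_term


# Toy usage: two groups of two coefficients each, linked in the network.
toy_coefs = [3.0, 4.0, 0.0, 0.0]
toy_groups = [[0, 1], [2, 3]]
toy_adjacency = [[0.0, 1.0], [1.0, 0.0]]
toy_value = bilevel_network_penalty(
    toy_coefs, toy_groups, toy_adjacency, lam1=1.0, lam2=1.0, lam3=1.0
)
```

In this toy setup the network term penalizes the fact that one group is active while its neighbor is entirely zero, which illustrates how a network penalty can encourage connected groups to be selected together.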