Combining SNP-to-gene linking strategies to identify disease genes and assess disease omnigenicity.
Steven Gazal, Omer Weissbrod, Farhad Hormozdiari, Kushal K Dey, Joseph Nasser, Karthik A Jagadeesh, Daniel J Weiner, Huwenbo Shi, Charles P Fulco, Luke O'Connor, Bogdan Pasaniuc, Jesse M Engreitz, Alkes L Price
Published in: Nature Genetics (2022)
Disease-associated single-nucleotide polymorphisms (SNPs) generally do not implicate target genes, as most disease SNPs are regulatory. Many SNP-to-gene (S2G) linking strategies have been developed to link regulatory SNPs to the genes that they regulate in cis. Here, we developed a heritability-based framework for evaluating and combining different S2G strategies to optimize their informativeness for common disease risk. Our optimal combined S2G strategy (cS2G) included seven constituent S2G strategies and achieved a precision of 0.75 and a recall of 0.33, more than doubling the recall of any individual strategy. We applied cS2G to fine-mapping results for 49 UK Biobank diseases/traits to predict 5,095 causal SNP-gene-disease triplets (with S2G-derived functional interpretation) with high confidence. We further applied cS2G to provide an empirical assessment of disease omnigenicity; we determined that the top 1% of genes explained roughly half of the SNP heritability linked to all genes and that gene-level architectures vary with variant allele frequency.
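The omnigenicity summary above (the share of gene-linked SNP heritability carried by the top 1% of genes) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `top_fraction_share`, the input format (a dictionary of per-gene heritability values), and the toy numbers are all assumptions made for illustration, and the sketch takes per-gene heritability estimates as given rather than deriving them from S2G links.

```python
import math


def top_fraction_share(gene_h2, top_fraction=0.01):
    """Return the share of total gene-linked heritability carried by the
    top `top_fraction` of genes, ranked by per-gene heritability.

    `gene_h2` maps a gene name to a non-negative heritability estimate
    (hypothetical input format, assumed for this sketch).
    """
    if not gene_h2:
        raise ValueError("gene_h2 must be non-empty")
    values = sorted(gene_h2.values(), reverse=True)
    total = sum(values)
    if total <= 0:
        raise ValueError("total heritability must be positive")
    # Always keep at least one gene in the top set.
    n_top = max(1, math.ceil(top_fraction * len(values)))
    return sum(values[:n_top]) / total


# Hypothetical toy data: 200 genes, two of which carry half of the signal.
toy = {f"gene{i}": 1.0 for i in range(198)}
toy["geneA"] = 120.0
toy["geneB"] = 78.0

share = top_fraction_share(toy, top_fraction=0.01)
print(f"Top 1% of genes carry {share:.0%} of gene-linked heritability")
```

In this toy setting the top 1% is two genes out of 200, so the statistic is simply their combined heritability divided by the total. A real analysis would also need to account for uncertainty in the per-gene estimates, which this sketch ignores.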