The correlation between CpG methylation and gene expression is driven by sequence variants.
Olafur Andri Stefansson, Brynja Dogg Sigurpalsdottir, Solvi Rognvaldsson, Gisli Hreinn Halldorsson, Kristinn Juliusson, Gardar Sveinbjörnsson, Bjarni Gunnarsson, Doruk Beyter, Hákon Jónsson, Sigurjon Axel Gudjonsson, Thorunn Asta Olafsdottir, Saedis Saevarsdottir, Magnus Karl Magnusson, Sigrun Helga Lund, Vinicius Tragante, Asmundur Oddsson, Marteinn Thor Hardarson, Hannes Petur Eggertsson, Reynir L Gudmundsson, Sverrir Sverrisson, Michael L Frigge, Florian Zink, Hilma Hólm, Hreinn Stefansson, Thorunn Rafnar, Ingileif Jónsdóttir, Patrick Sulem, Agnar Helgason, Daníel F Guðbjartsson, Bjarni V Halldórsson, Unnur Thorsteinsdottir, Kári Stefánsson
Published in: Nature Genetics (2024)
Gene promoter and enhancer sequences are bound by transcription factors and are depleted of methylated CpG sites (cytosines preceding guanines in DNA). The absence of methylated CpGs in these sequences typically correlates with increased gene expression, indicating a regulatory role for methylation. We used nanopore sequencing to determine haplotype-specific methylation rates of 15.3 million CpG units in 7,179 whole-blood genomes. We identified 189,178 methylation-depleted sequences where three or more proximal CpGs were unmethylated on at least one haplotype. A total of 77,789 methylation-depleted sequences (~41%) associated with 80,503 cis-acting sequence variants, which we termed allele-specific methylation quantitative trait loci (ASM-QTLs). RNA sequencing of 896 samples from the same blood draws used for nanopore sequencing showed that the ASM-QTLs, that is, DNA sequence variability, drive most of the correlation found between gene expression and CpG methylation. ASM-QTLs were enriched 40.2-fold (95% confidence interval 32.2, 49.9) among sequence variants associating with hematological traits, demonstrating that ASM-QTLs are important functional units in the noncoding genome.
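
To make the definition of a methylation-depleted sequence concrete, the following Python sketch flags runs of three or more proximal CpGs that are unmethylated on a given haplotype. It is an illustration only, not the authors' pipeline: the abstract does not state what methylation rate counts as "unmethylated" or how close CpGs must be to count as "proximal", so the `max_rate` and `max_gap` values, the function name `depleted_runs`, and the toy data are all assumptions.

```python
from typing import List, Sequence, Tuple


def depleted_runs(
    positions: Sequence[int],
    rates: Sequence[float],
    max_rate: float = 0.2,   # assumed cutoff for "unmethylated" (not from the abstract)
    max_gap: int = 100,      # assumed maximum spacing in bp for "proximal" (not from the abstract)
    min_cpgs: int = 3,       # "three or more proximal CpGs", as stated in the abstract
) -> List[Tuple[int, int]]:
    """Return (start, end) positions of runs of >= min_cpgs consecutive CpGs
    whose methylation rate is <= max_rate and whose neighbours lie within
    max_gap bp of each other. Positions must be sorted in ascending order."""
    runs: List[Tuple[int, int]] = []
    current: List[int] = []
    for pos, rate in zip(positions, rates):
        unmethylated = rate <= max_rate
        if unmethylated and (not current or pos - current[-1] <= max_gap):
            current.append(pos)
            continue
        # The run is broken, either by a methylated CpG or by a large gap.
        if len(current) >= min_cpgs:
            runs.append((current[0], current[-1]))
        current = [pos] if unmethylated else []
    if len(current) >= min_cpgs:
        runs.append((current[0], current[-1]))
    return runs


# Toy data (hypothetical): CpG coordinates and per-haplotype methylation rates.
positions = [100, 120, 150, 400, 420, 430, 445]
hap1 = [0.05, 0.10, 0.02, 0.90, 0.85, 0.90, 0.95]
hap2 = [0.90, 0.80, 0.95, 0.10, 0.05, 0.00, 0.90]

# A region qualifies if it is depleted on at least one haplotype.
depleted = {"hap1": depleted_runs(positions, hap1),
            "hap2": depleted_runs(positions, hap2)}
print(depleted)
```

In this toy example each haplotype carries a different depleted run, which is the kind of haplotype-specific pattern that an allele-specific methylation analysis would go on to test against nearby sequence variants.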