Deep learning predicts DNA methylation regulatory variants in the human brain and elucidates the genetics of psychiatric disorders.
Jiyun Zhou, Qiang Chen, Patricia R Braun, Kira A Perzel Mandell, Andrew E Jaffe, Hao Yang Tan, Thomas M Hyde, Joel E Kleinman, James B Potash, Gen Shinozaki, Daniel R Weinberger, Shizhong Han
Published in: Proceedings of the National Academy of Sciences of the United States of America (2022)
There is growing evidence for the role of DNA methylation (DNAm) quantitative trait loci (mQTLs) in the genetics of complex traits, including psychiatric disorders. However, because of extensive linkage disequilibrium (LD) in the genome, it is challenging to identify the causal genetic variants that drive DNAm levels through population-based genetic association studies. This limits the utility of mQTLs for fine-mapping the risk loci for psychiatric disorders identified by genome-wide association studies (GWAS). Here we present INTERACT, a deep learning model that integrates convolutional neural networks with a transformer to predict the effects of genetic variants on DNAm levels at CpG sites in the human brain. We show that INTERACT-derived DNAm regulatory variants are not confounded by LD, are concentrated in regulatory genomic regions in the human brain, and converge with mQTL evidence from genetic association analysis. We further demonstrate that predicted DNAm regulatory variants are enriched for heritability of brain-related traits and improve polygenic risk prediction for schizophrenia across samples of diverse ancestry. Finally, we applied predicted DNAm regulatory variants to fine-map schizophrenia GWAS risk loci and identify potential novel risk genes. Our study shows the power of a deep learning approach to identify functional regulatory variants that may elucidate the genetic basis of complex traits.
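The idea of scoring a variant with a sequence-based predictor can be sketched as follows. This is a minimal illustration, not the INTERACT implementation: it assumes the common in-silico mutagenesis scheme, in which a variant's effect is the difference between the model's prediction for the alternate-allele sequence and for the reference sequence. The names `one_hot`, `toy_model`, and `variant_effect` are hypothetical, and `toy_model` is a trivial placeholder standing in for a trained CNN and transformer network.

```python
from typing import Callable, List

BASES = "ACGT"  # column order of the one-hot encoding: A, C, G, T

Encoded = List[List[float]]


def one_hot(seq: str) -> Encoded:
    """Encode a DNA string as one 4-element one-hot row per base."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq.upper()]


def toy_model(encoded: Encoded) -> float:
    """Hypothetical stand-in for a trained predictor (assumption, not INTERACT).

    Returns the fraction of adjacent base pairs that are a C followed by a G,
    so the output lies in [0, 1] like a methylation level would.
    """
    if len(encoded) < 2:
        return 0.0
    c_idx, g_idx = BASES.index("C"), BASES.index("G")
    cpg_pairs = sum(
        1
        for i in range(len(encoded) - 1)
        if encoded[i][c_idx] == 1.0 and encoded[i + 1][g_idx] == 1.0
    )
    return cpg_pairs / (len(encoded) - 1)


def variant_effect(
    model: Callable[[Encoded], float], ref_seq: str, pos: int, alt: str
) -> float:
    """Score a single-nucleotide variant as prediction(alt) - prediction(ref).

    `pos` is a 0-based index into `ref_seq`; `alt` is the substituted base.
    """
    if alt.upper() not in BASES or len(alt) != 1:
        raise ValueError("alt must be one of A, C, G, T")
    if not 0 <= pos < len(ref_seq):
        raise IndexError("pos is outside the reference sequence")
    alt_seq = ref_seq[:pos] + alt.upper() + ref_seq[pos + 1:]
    return model(one_hot(alt_seq)) - model(one_hot(ref_seq))


# Example: a C>T substitution that destroys the only CpG in the window
effect = variant_effect(toy_model, "AACGTT", 2, "T")
```

Because each variant is scored from its own sequence context instead of from co-inheritance patterns in a population, a score of this kind does not depend on LD with neighbouring variants, which is the property the abstract highlights.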
Keyphrases
- genome wide
- copy number
- dna methylation
- deep learning
- convolutional neural network
- transcription factor
- nk cells
- gene expression
- high resolution
- bipolar disorder
- genome wide association
- air pollution
- machine learning
- multiple sclerosis
- high density
- resting state
- blood brain barrier
- human immunodeficiency virus
- hiv infected
- climate change
- white matter
- drug induced
- case control