BrainGENIE: The Brain Gene Expression and Network Imputation Engine.
Jonathan L Hess, Thomas P Quinn, Chunling Zhang, Gentry C Hearn, Samuel Chen, Sek Won Kong, Murray J Cairns, Ming T Tsuang, Stephen V Faraone, Stephen J Glatt
Published in: Translational Psychiatry (2023)
In vivo experimental analysis of human brain tissue poses substantial challenges and ethical concerns. To address this problem, we developed a computational method called the Brain Gene Expression and Network-Imputation Engine (BrainGENIE) that leverages peripheral-blood transcriptomes to predict brain tissue-specific gene-expression levels. Paired blood-brain transcriptomic data collected by the Genotype-Tissue Expression (GTEx) Project were used to train BrainGENIE models to predict gene-expression levels in ten distinct brain regions from whole-blood gene-expression profiles. The performance of BrainGENIE was compared to that of PrediXcan, a popular method for imputing gene-expression levels from genotypes. BrainGENIE significantly predicted brain tissue-specific expression levels for 2,947 to 11,816 genes (false-discovery rate-adjusted p < 0.05), including many transcripts that cannot be predicted significantly by a transcriptome-imputation method such as PrediXcan. BrainGENIE recapitulated measured diagnosis-related gene-expression changes in the brain for autism, bipolar disorder, and schizophrenia better than direct correlations from blood and predictions from PrediXcan. We developed a convenient software toolset for deploying BrainGENIE and provide recommendations on how best to implement the models. BrainGENIE complements and, in some ways, outperforms existing transcriptome-imputation tools, providing biologically meaningful predictions and opening new research avenues.
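The general idea of learning a blood-to-brain mapping from paired samples and then counting the genes that are predicted significantly can be illustrated with a small sketch. This is not the BrainGENIE implementation: the abstract does not specify the model, so the choice of principal components followed by least-squares regression is an assumption made here for illustration, as are all function names, the simulated data, and the Fisher z approximation used for p-values.

```python
import math
import numpy as np

def fit_blood_to_brain(blood_train, brain_train, n_components=5):
    """Fit a linear map from blood principal components to brain expression.
    (Illustrative model choice; not taken from the source.)"""
    mean_blood = blood_train.mean(axis=0)
    centered = blood_train - mean_blood
    # Rows of vt are the principal axes of the centered blood matrix.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    axes = vt[:n_components].T                      # blood genes x components
    scores = centered @ axes                        # samples x components
    design = np.column_stack([np.ones(len(scores)), scores])
    coef, *_ = np.linalg.lstsq(design, brain_train, rcond=None)
    return mean_blood, axes, coef

def predict_brain(model, blood):
    """Apply a fitted model to new blood profiles."""
    mean_blood, axes, coef = model
    scores = (blood - mean_blood) @ axes
    design = np.column_stack([np.ones(len(scores)), scores])
    return design @ coef

def pearson_by_gene(pred, obs):
    """Pearson correlation between predicted and observed values, per gene."""
    p = pred - pred.mean(axis=0)
    o = obs - obs.mean(axis=0)
    return (p * o).sum(axis=0) / np.sqrt((p ** 2).sum(axis=0) * (o ** 2).sum(axis=0))

def approx_p_values(r, n):
    """Two-sided p-values via the Fisher z approximation (an approximation)."""
    z = np.arctanh(np.clip(r, -0.999999, 0.999999)) * math.sqrt(n - 3)
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in z])

def benjamini_hochberg(p):
    """Benjamini-Hochberg false-discovery-rate adjustment."""
    p = np.asarray(p, dtype=float)
    m = len(p)
    order = np.argsort(p)
    ranked = p[order] * m / np.arange(1, m + 1)
    # Enforce monotonicity, working down from the largest p-value.
    ranked = np.minimum.accumulate(ranked[::-1])[::-1]
    adjusted = np.empty(m)
    adjusted[order] = np.clip(ranked, 0, 1)
    return adjusted

# Simulated paired data: 3 shared latent factors drive blood and some brain genes.
rng = np.random.default_rng(0)
n_samples, n_blood, n_signal, n_noise = 120, 200, 25, 25
latent = rng.normal(size=(n_samples, 3))
blood = latent @ rng.normal(size=(3, n_blood)) + 0.5 * rng.normal(size=(n_samples, n_blood))
brain_signal = latent @ rng.normal(size=(3, n_signal)) + 0.5 * rng.normal(size=(n_samples, n_signal))
brain_noise = rng.normal(size=(n_samples, n_noise))   # unrelated to blood
brain = np.column_stack([brain_signal, brain_noise])

# Train on the first 80 samples, evaluate on the 40 held-out samples.
model = fit_blood_to_brain(blood[:80], brain[:80])
pred = predict_brain(model, blood[80:])
r = pearson_by_gene(pred, brain[80:])
q = benjamini_hochberg(approx_p_values(r, n=40))
print("genes predicted at FDR < 0.05:", int((q < 0.05).sum()), "of", brain.shape[1])
```

In this toy setting the genes that share latent structure with blood are predicted well on held-out samples, while the unrelated genes are not, which mirrors why only a subset of genes passes the false-discovery-rate threshold in each brain region.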
Keyphrases
- gene expression
- DNA methylation
- white matter
- resting state
- bipolar disorder
- genome wide
- cerebral ischemia
- single cell
- poor prognosis
- peripheral blood
- RNA-seq
- autism spectrum disorder
- multiple sclerosis
- major depressive disorder
- electronic health record
- high throughput
- long non-coding RNA
- big data
- copy number
- high resolution
- blood brain barrier
- deep learning