DeepGenePrior: A deep learning model for prioritizing genes affected by copy number variants.
Zahra Rahaie, Hamid R Rabiee, Hamid Alinejad-Rokny
Published in: PLoS Computational Biology (2023)
Brain disorders are characterized by abnormalities in the development of the central nervous system that lead to diminished physical or intellectual capabilities, and their genetic etiology is highly heterogeneous. The process of determining which gene drives a disease, known as "gene prioritization," is not entirely understood. Genome-wide searches for gene-disease associations remain underdeveloped because they rely on previous discoveries and on evidence sources that contain false positive or false negative relations. This paper introduces DeepGenePrior, a model based on deep neural networks that prioritizes candidate genes in genetic diseases. Using the well-studied Variational AutoEncoder (VAE), we developed a score to measure the impact of genes on target diseases. Unlike other methods, which select candidate genes from prior data using the "guilt by association" principle and auxiliary data sources such as protein networks, our study exclusively employs copy number variants (CNVs) for gene prioritization. By analyzing CNVs from 74,811 individuals with autism, schizophrenia, and developmental delay, we identified genes that best distinguish cases from controls. Our findings indicate a 12% increase in fold enrichment in brain-expressed genes compared to previous studies and a 15% increase in genes associated with mouse nervous system phenotypes. Furthermore, we identified common deletions in ZDHHC8, DGCR5, and CATG00000022283 among the top genes related to all three disorders, suggesting a common etiology among these clinically distinct conditions. DeepGenePrior is publicly available online at http://git.dml.ir/z_rahaie/DGP to help address the obstacles that existing gene prioritization studies face in identifying candidate genes.
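The abstract reports its results as fold enrichment but does not define the calculation. As a minimal sketch, assuming the common definition (the fraction of annotated genes among the top-ranked genes, divided by the same fraction in the background gene set), it could be computed as follows. The function name, argument names, and the numbers are hypothetical illustrations, not values or code from the paper, and the authors' exact procedure may differ.

```python
def fold_enrichment(hits_in_top, top_size, hits_in_background, background_size):
    """Fold enrichment of an annotation (e.g. 'brain-expressed') in a top-ranked gene list.

    Assumed definition: (hits_in_top / top_size) / (hits_in_background / background_size).
    """
    if top_size <= 0 or background_size <= 0:
        raise ValueError("gene set sizes must be positive")
    if hits_in_background <= 0:
        raise ValueError("the background must contain at least one annotated gene")
    observed = hits_in_top / top_size                  # annotated fraction among prioritized genes
    expected = hits_in_background / background_size    # annotated fraction in the background
    return observed / expected


# Hypothetical example: 30 of the top 100 genes carry the annotation,
# against 4,000 of 20,000 genes in the background.
example = fold_enrichment(30, 100, 4000, 20000)
```

Under this definition, a value above 1 means the prioritized genes carry the annotation more often than the background does.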
Keyphrases
- copy number
- genome wide
- mitochondrial dna
- dna methylation
- deep learning
- genome wide identification
- bipolar disorder
- gene expression
- neural network
- machine learning
- autism spectrum disorder
- physical activity
- healthcare
- white matter
- big data
- resting state
- mental health
- multiple sclerosis
- electronic health record
- social media
- intellectual disability
- transcription factor
- health information
- functional connectivity
- genome wide analysis