Integration of Imaging Genomics Data for the Study of Alzheimer's Disease Using Joint-Connectivity-Based Sparse Nonnegative Matrix Factorization.
Kai Wei, Wei Kong, Shuaiqun Wang
Published in: Journal of molecular neuroscience : MN (2021)
Imaging genetics reveals the connection between microscopic genetics and macroscopic imaging, enabling the identification of disease biomarkers. In this paper, we propose joint-connectivity-based sparse nonnegative matrix factorization (JCB-SNMF), which uses prior knowledge about the relationship between the brain and genetics to find more biologically meaningful biomarkers. The algorithm simultaneously projects structural magnetic resonance imaging (sMRI), single-nucleotide polymorphism (SNP), and gene expression data onto a common feature space, where heterogeneous variables with large coefficients in the same projection direction form a common module. In addition, connectivity information for the brain regions and for the genetic data is added as prior knowledge to identify risk regions of interest (ROIs), SNPs, and genes associated with Alzheimer's disease (AD). GraphNet regularization improves both the noise robustness of the algorithm and the biological interpretability of the results. Simulation results show that, compared with other NMF-based algorithms (JNMF, JSNMNMF), JCB-SNMF is more robust to noise and can identify and predict biomarkers closely related to AD from significant modules. By constructing a protein-protein interaction (PPI) network, we identified SF3B1, RPS20, and RBM14 as potential biomarkers of AD. We also found some significant SNP-ROI and gene-ROI pairs. Among them, two SNPs (rs4472239 and rs11918049) and three genes (KLHL8, ZC3H11A, and OSGEPL1) may affect the gray matter volume of multiple brain regions. This model provides a new way to integrate multimodal imaging genetic data and identify complex disease association patterns.
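The factorization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' JCB-SNMF implementation: it assumes a shared basis `W` across modalities, one coefficient matrix `H_i` per modality, an L1 penalty for sparsity, and a graph-Laplacian penalty built from a connectivity matrix `A_i` as a simple stand-in for the connectivity prior and GraphNet regularization. The function names, the multiplicative update rules, the parameters `lam` and `gamma`, and the z-score rule for picking module members are all assumptions made for illustration.

```python
import numpy as np

def joint_sparse_graph_nmf(Xs, As, k, lam=0.1, gamma=0.1, n_iter=200, seed=0):
    """Illustrative joint NMF: X_i ~= W @ H_i with one shared W (hypothetical helper).

    Xs: list of nonnegative (samples x features_i) matrices, one per modality.
    As: list of symmetric nonnegative (features_i x features_i) connectivity matrices.
    Minimizes sum_i ||X_i - W H_i||_F^2 + lam * tr(H_i L_i H_i^T) + gamma * sum(H_i),
    where L_i = D_i - A_i is the graph Laplacian (an assumed form of the prior).
    """
    rng = np.random.default_rng(seed)
    eps = 1e-10  # guards against division by zero
    n = Xs[0].shape[0]
    W = rng.random((n, k))
    Hs = [rng.random((k, X.shape[1])) for X in Xs]
    Ds = [np.diag(A.sum(axis=1)) for A in As]  # degree matrices
    for _ in range(n_iter):
        # Shared basis update pools evidence from every modality.
        num = sum(X @ H.T for X, H in zip(Xs, Hs))
        den = W @ sum(H @ H.T for H in Hs) + eps
        W *= num / den
        # Per-modality coefficient update with graph and sparsity terms.
        for i, (X, A, D) in enumerate(zip(Xs, As, Ds)):
            H = Hs[i]
            num = W.T @ X + lam * (H @ A)
            den = W.T @ W @ H + lam * (H @ D) + gamma / 2 + eps
            Hs[i] = H * num / den
    return W, Hs

def select_modules(H, z_thresh=2.0):
    """For each component (row of H), return indices of features with a large z-score."""
    z = (H - H.mean(axis=1, keepdims=True)) / (H.std(axis=1, keepdims=True) + 1e-10)
    return [np.flatnonzero(row > z_thresh) for row in z]

def random_adjacency(p, rng, density=0.1):
    """Random symmetric 0/1 adjacency with an empty diagonal (toy data only)."""
    upper = np.triu(rng.random((p, p)) < density, 1)
    return (upper + upper.T).astype(float)

# Toy data standing in for sMRI ROIs, SNPs, and gene expression (random, not real).
rng = np.random.default_rng(1)
n_samples, sizes, k = 30, (20, 40, 25), 4
Xs = [rng.random((n_samples, p)) for p in sizes]
As = [random_adjacency(p, rng) for p in sizes]

W, Hs = joint_sparse_graph_nmf(Xs, As, k)
modules = [select_modules(H) for H in Hs]  # modules[i][j]: features of modality i in module j
```

Features from different modalities that are selected for the same component index `j` form one candidate cross-modal module, which mirrors the "same projection direction" idea in the abstract. The thresholding rule and penalty weights would need tuning on real data.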
Keyphrases
- genome wide
- resting state
- white matter
- functional connectivity
- machine learning
- dna methylation
- gene expression
- protein protein
- magnetic resonance imaging
- electronic health record
- high resolution
- big data
- copy number
- deep learning
- healthcare
- end stage renal disease
- multiple sclerosis
- air pollution
- neural network
- small molecule
- fluorescence imaging
- genome wide identification
- single cell
- computed tomography
- newly diagnosed
- cerebral ischemia
- data analysis
- brain injury
- risk assessment
- health information
- social media