Comethyl: a network-based methylome approach to investigate the multivariate nature of health and disease.
Charles E Mordaunt, Julia S Mouat, Rebecca J Schmidt, Janine M LaSalle
Published in: Briefings in Bioinformatics (2022)
Health outcomes are frequently shaped by difficult-to-dissect inter-relationships between biological, behavioral, social and environmental factors. DNA methylation patterns reflect such multivariate intersections, providing a rich source of novel biomarkers and insight into disease etiologies. Recent advances in whole-genome bisulfite sequencing enable investigation of DNA methylation over all genomic CpGs, but existing bioinformatic approaches lack accessible system-level tools. Here, we develop the R package Comethyl for weighted gene correlation network analysis of user-defined genomic regions. Comethyl generates modules of comethylated regions, which are then tested for correlations with multivariate sample traits. First, regions are defined by CpG genomic location or regulatory annotation and filtered based on CpG count, sequencing depth and variability. Next, correlation networks built from methylation values within the selected regions are used to find modules of interconnected nodes. Each module containing multiple comethylated regions is reduced in complexity to a single eigennode value, which is then tested for correlations with experimental metadata. Comethyl can cover the noncoding regulatory regions of the genome, which are highly relevant to the interpretation of genome-wide association studies and to integration with other types of epigenomic data. We demonstrate the utility of Comethyl on a dataset of male cord blood samples from newborns later diagnosed with autism spectrum disorder (ASD) versus typical development. Comethyl successfully identified an ASD-associated module containing regions mapped to genes enriched for brain glial functions. Comethyl is expected to be useful in uncovering the multivariate nature of health disparities for a variety of common disorders. Comethyl is available at github.com/cemordaunt/comethyl with complete documentation and example analyses.
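The network steps described above (correlation network, module eigennode, trait correlation) can be illustrated with a minimal sketch. This is not Comethyl's R interface: it is a standard-library Python illustration of the general weighted correlation network idea, and every function name, the soft-threshold power of 6, and the toy methylation values are assumptions made for the example. It assumes the regions have already been filtered and grouped into a single module.

```python
import math
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def adjacency(regions, power=6):
    """Unsigned soft-threshold adjacency: a_ij = |cor(i, j)| ** power.

    The power is a hypothetical choice; in practice it is tuned per dataset.
    """
    n = len(regions)
    return [[abs(pearson(regions[i], regions[j])) ** power for j in range(n)]
            for i in range(n)]


def standardize(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    m = mean(values)
    sd = math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))
    return [(v - m) / sd for v in values]


def eigennode(module, iterations=200):
    """Per-sample summary of a module: the first principal component of the
    standardized region-by-sample matrix, found by power iteration."""
    rows = [standardize(r) for r in module]
    n = len(rows[0])
    # Sample-by-sample Gram matrix X^T X, where X is regions x samples.
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    # Start from a centered vector; a constant vector lies in the null space.
    vec = rows[0][:]
    for _ in range(iterations):
        vec = [sum(gram[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in vec))
        vec = [v / norm for v in vec]
    # Orient the eigennode to follow the average standardized methylation.
    avg = [mean(col) for col in zip(*rows)]
    if sum(a * b for a, b in zip(vec, avg)) < 0:
        vec = [-v for v in vec]
    return vec


# Toy data (invented): three comethylated regions measured in six samples.
module = [
    [0.10, 0.20, 0.30, 0.60, 0.70, 0.80],
    [0.15, 0.22, 0.35, 0.55, 0.72, 0.85],
    [0.12, 0.25, 0.28, 0.62, 0.68, 0.79],
]
trait = [0, 0, 0, 1, 1, 1]  # hypothetical binary sample trait

adj = adjacency(module)
me = eigennode(module)
trait_cor = pearson(me, trait)
print(f"eigennode-trait correlation: {trait_cor:.3f}")
```

Summarizing each module by one eigennode means that only one correlation test per module and trait is needed, rather than one per region, which keeps the multiple-testing burden manageable when many traits are examined.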
Keyphrases
- dna methylation
- genome wide
- copy number
- cord blood
- healthcare
- autism spectrum disorder
- mental health
- gene expression
- data analysis
- attention deficit hyperactivity disorder
- magnetic resonance
- pregnant women
- transcription factor
- magnetic resonance imaging
- network analysis
- squamous cell carcinoma
- multiple sclerosis
- genome wide association
- big data
- computed tomography
- optical coherence tomography
- lymph node
- intellectual disability
- neuropathic pain
- brain injury
- machine learning
- peripheral blood
- neoadjuvant chemotherapy
- working memory
- radiation therapy
- health insurance
- genome wide analysis
- low birth weight