CWAS-Plus: estimating category-wide association of rare noncoding variation from whole-genome sequencing data with cell-type-specific functional data.
Yujin Kim, Minwoo Jeong, In Gyeong Koh, Chanhee Kim, Hyeji Lee, Jae Hyun Kim, Ronald Yurko, Il Bin Kim, Jeongbin Park, Donna M Werling, Stephan J Sanders, Joon-Yong An
Published in: Briefings in bioinformatics (2024)
Variants in cis-regulatory elements link the noncoding genome to human pathology; however, detailed analytic tools for understanding the association between cell-level brain pathology and noncoding variants are lacking. CWAS-Plus, adapted from a Python package for category-wide association testing (CWAS), enhances noncoding variant analysis by integrating both whole-genome sequencing (WGS) and user-provided functional data. With simplified parameter settings and an efficient multiple testing correction method, CWAS-Plus conducts the CWAS workflow 50 times faster than CWAS, making it more accessible and user-friendly for researchers. Here, we used a single-nuclei assay for transposase-accessible chromatin with sequencing to facilitate CWAS-guided noncoding variant analysis at cell-type-specific enhancers and promoters. Examining autism spectrum disorder WGS data (n = 7280), CWAS-Plus identified noncoding de novo variant associations in transcription factor binding sites within conserved loci. Independently, in Alzheimer's disease WGS data (n = 1087), CWAS-Plus detected rare noncoding variant associations in microglia-specific regulatory elements. These findings highlight the utility of CWAS-Plus for studying genomic disorders and its scalability in processing large-scale WGS data and performing multiple-testing correction. CWAS-Plus and its user manual are available at https://github.com/joonan-lab/cwas/ and https://cwas-plus.readthedocs.io/en/latest/, respectively.
Keyphrases
- transcription factor
- electronic health record
- autism spectrum disorder
- big data
- copy number
- endothelial cells
- genome wide
- stem cells
- inflammatory response
- spinal cord
- spinal cord injury
- data analysis
- mesenchymal stem cells
- cell therapy
- working memory
- neuropathic pain
- brain injury
- subarachnoid hemorrhage
- deep learning
- resting state
- genome wide association study