Coalitional Game Theory Facilitates Identification of Non-Coding Variants Associated With Autism.
Min Woo Sun, Anika Gupta, Maya Varma, Kelley M Paskov, Jae-Yoon Jung, Nate T Stockham, Dennis Paul Wall
Published in: Biomedical informatics insights (2019)
Studies on autism spectrum disorder (ASD) have amassed substantial evidence for the role of genetics in the disease's phenotypic manifestation. A large number of coding and non-coding variants with low penetrance likely act in a combinatorial manner to explain the variable forms of ASD. However, many of these combined interactions, both additive and epistatic, remain undefined. Coalitional game theory (CGT) is an approach that seeks to identify players (individual genetic variants or genes) who tend to improve the performance (association with a disease phenotype of interest) of any coalition (subset of co-occurring genetic variants) they join. This method has previously been applied to boost biologically informative signal from gene expression data and exome sequencing data, but it remains to be explored in the context of cooperativity among non-coding genomic regions. We describe our extension of previous work, highlighting non-coding chromosomal regions relevant to ASD using CGT on alteration data of 4595 fully sequenced genomes from 756 multiplex families. Genomes were encoded into binary matrices for three types of non-coding regions previously implicated in ASD and separated into ASD (case) and unaffected (control) samples. A player metric, the Shapley value, enabled determination of individual variant contributions in both cohorts. A total of 30 non-coding positions were found to have significantly elevated player scores and likely represent significant contributors to the genetic coordination underlying ASD. Cross-study analyses revealed that a subset of mutated non-coding regions, all of which lie in human accelerated regions (HARs), and related genes are involved in biological pathways or behavioral outcomes known to be affected in autism, suggesting the importance of single nucleotide polymorphisms (SNPs) within HARs in ASD.
These findings support the use of CGT in identifying hidden yet influential non-coding players from large-scale genomic data, to better understand the precise underpinnings of complex neurodevelopmental disorders such as autism.
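
To make the Shapley value concrete, the following is a minimal sketch of how a player score could be computed from a binary alteration matrix. It is not the authors' pipeline: the characteristic function used here (the fraction of samples whose altered regions all fall inside a coalition, in the style of a microarray game), the function names `shapley_values` and `make_value`, and the toy matrix are all assumptions chosen for illustration. The enumeration is exact but exponential in the number of players, so it is only practical for a handful of regions.

```python
from itertools import combinations
from math import factorial


def shapley_values(n_players, value):
    """Exact Shapley values by enumerating every coalition (small n only)."""
    phi = [0.0] * n_players
    n_fact = factorial(n_players)
    for i in range(n_players):
        others = [p for p in range(n_players) if p != i]
        for size in range(n_players):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n_players - size - 1) / n_fact
            for coalition in combinations(others, size):
                s = frozenset(coalition)
                # Marginal contribution of player i to coalition s.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def make_value(matrix):
    """Build an illustrative characteristic function from a binary matrix.

    matrix[j][i] == 1 means sample j carries an alteration in region i.
    A coalition scores the fraction of samples whose altered regions are
    all contained in it (an assumed, microarray-game style definition).
    """
    supports = [frozenset(i for i, x in enumerate(row) if x) for row in matrix]

    def value(coalition):
        hits = sum(1 for s in supports if s and s <= coalition)
        return hits / len(matrix)

    return value


# Hypothetical toy data: 4 samples (rows) by 3 non-coding regions (columns).
toy_cases = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 1, 1],
]

scores = shapley_values(3, make_value(toy_cases))
for region, score in enumerate(scores):
    print(f"region {region}: Shapley value {score:.4f}")
```

Under this assumed game, a region's score rises when it is altered in many samples and when it shares those samples with few other altered regions. Running the same computation separately on case and control matrices and comparing the resulting scores would mirror the case/control split described above, though the statistical test used to call a score significantly elevated is not specified here.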