Minimum entropy framework identifies a novel class of genomic functional elements and reveals regulatory mechanisms at human disease loci.
Michael J Betti, Melinda C Aldrich, Eric R Gamazon. Published in: bioRxiv: the preprint server for biology (2023)
We introduce CoRE-BED, a framework trained using 19 epigenomic features encompassing 33 major cell and tissue types to predict cell-type-specific regulatory function. CoRE-BED's interpretability facilitates causal inference and functional prioritization. CoRE-BED identifies nine functional classes de novo, capturing both known and completely new regulatory categories. Notably, we describe a previously uncharacterized class termed Development Associated Elements (DAEs), which are highly enriched in stem-like cell types and distinguished by the co-occurrence of either H3K4me2 and H3K9ac or H3K79me3 and H4K20me1. Unlike bivalent promoters, which represent a transitory state between active and silenced promoters, DAEs transition directly to or from a non-functional state during stem cell differentiation and are proximal to highly expressed genes. Across 70 GWAS traits, SNPs disrupting CoRE-BED elements explain nearly all SNP heritability, despite comprising only a fraction of all SNPs. Notably, we provide evidence that DAEs are implicated in neurodegeneration. Collectively, our results show that CoRE-BED is an effective prioritization tool for post-GWAS analysis.
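The mark combination that distinguishes DAEs can be read as a simple boolean rule. The sketch below encodes only that criterion as stated in the abstract; it is not CoRE-BED's classifier, and the function name `has_dae_signature` and the representation of a region as a set of mark names are assumptions made for illustration.

```python
# Hypothetical helper: checks only the histone-mark pairing named in the abstract.
# It does not reproduce CoRE-BED's trained model or its other functional classes.
DAE_PAIR_A = frozenset({"H3K4me2", "H3K9ac"})
DAE_PAIR_B = frozenset({"H3K79me3", "H4K20me1"})


def has_dae_signature(marks):
    """Return True if `marks` contains both members of either DAE mark pair."""
    present = set(marks)
    # A pair counts only when both of its marks are present together.
    return DAE_PAIR_A <= present or DAE_PAIR_B <= present
```

Under this reading, a region carrying one mark from each pair (for example H3K4me2 with H4K20me1) would not satisfy the rule, because neither pair is complete.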