Promoter-enhancer interactions identified from Hi-C data using probabilistic models and hierarchical topological domains.
Gil Ron, Yuval Globerson, Dror Moran, Tommy Kaplan
Published in: Nature Communications (2017)
Proximity-ligation methods such as Hi-C allow us to map physical DNA-DNA interactions along the genome, and reveal its organization into topologically associating domains (TADs). As Hi-C data have accumulated, computational methods have been developed for identifying domain borders in multiple cell types and organisms. Here, we present PSYCHIC, a computational approach for analyzing Hi-C data and identifying promoter-enhancer interactions. We use a unified probabilistic model to segment the genome into domains, which we then merge hierarchically and fit using a local background model, allowing us to identify over-represented DNA-DNA interactions across the genome. By analyzing published Hi-C data sets in human and mouse, we identify hundreds of thousands of putative enhancers and their target genes, and compile an extensive genome-wide catalog of gene regulation in both species. As we show, our predictions are highly enriched for ChIP-seq and DNA accessibility data, evolutionary conservation, eQTLs and other DNA-DNA interaction data.
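The last step described above, fitting a local background within each domain and flagging over-represented interactions, can be illustrated with a small sketch. This is not the PSYCHIC implementation. It assumes that domain boundaries are already given, that the background inside a domain follows a single power law in genomic distance, and that a fixed log-ratio cutoff defines over-representation. The function names, the `min_log_ratio` parameter and the synthetic matrix are all hypothetical, and the segmentation and hierarchical merging steps are not modeled.

```python
import numpy as np

def fit_local_background(contacts, start, end):
    """Fit log(count) = a + b * log(distance) inside one domain [start, end).

    Assumption: a single power-law decay per domain (illustrative only).
    """
    idx_i, idx_j = np.triu_indices(end - start, k=1)
    dist = (idx_j - idx_i).astype(float)
    counts = contacts[start + idx_i, start + idx_j].astype(float)
    mask = counts > 0  # the log is undefined for empty bins
    # np.polyfit returns the highest-degree coefficient first: slope, intercept
    b, a = np.polyfit(np.log(dist[mask]), np.log(counts[mask]), 1)
    return a, b

def over_represented_pairs(contacts, domains, min_log_ratio=1.0):
    """Return (i, j, observed/expected) for bin pairs above the local background."""
    hits = []
    for start, end in domains:
        a, b = fit_local_background(contacts, start, end)
        for i in range(start, end):
            for j in range(i + 1, end):
                expected = np.exp(a + b * np.log(j - i))
                observed = contacts[i, j]
                if observed > 0 and np.log(observed / expected) >= min_log_ratio:
                    hits.append((i, j, float(observed / expected)))
    return hits

# Synthetic example: two domains of 10 bins with 1/distance decay inside each,
# a flat low count between domains, and one artificially enriched pair.
n = 20
domains = [(0, 10), (10, 20)]
contacts = np.ones((n, n))
for start, end in domains:
    for i in range(start, end):
        for j in range(i + 1, end):
            contacts[i, j] = contacts[j, i] = 100.0 / (j - i)
contacts[2, 7] = contacts[7, 2] = 10 * contacts[2, 7]  # planted interaction

hits = over_represented_pairs(contacts, domains)
print([(i, j) for i, j, _ in hits])
```

Fitting the background separately per domain, rather than once genome-wide, is what lets a pair stand out relative to its own neighborhood. A real analysis would also need normalization and a statistical significance measure, which this sketch omits.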
Keyphrases
- genome wide
- circulating tumor
- cell free
- single molecule
- electronic health record
- dna methylation
- big data
- endothelial cells
- gene expression
- circulating tumor cells
- single cell
- transcription factor
- randomized controlled trial
- stem cells
- systematic review
- machine learning
- artificial intelligence
- copy number
- data analysis
- mental health
- multidrug resistant
- mesenchymal stem cells
- rna seq
- high density