An integrated model for detecting significant chromatin interactions from high-resolution Hi-C data.

Mark Carty, Lee Zamparo, Merve Sahin, Alvaro González, Raphael Pelossof, Olivier Elemento, Christina S Leslie
Published in: Nature Communications (2017)
Here we present HiC-DC, a principled method to estimate the statistical significance (P values) of chromatin interactions from Hi-C experiments. HiC-DC uses hurdle negative binomial regression to account for systematic sources of variation in Hi-C read counts (for example, distance-dependent random polymer ligation, GC content, and mappability bias) and to model zero inflation and overdispersion. Applied to high-resolution Hi-C data in a lymphoblastoid cell line, HiC-DC detects significant interactions at the sub-topologically associating domain level, identifying potential structural and regulatory interactions supported by CTCF binding sites, DNase accessibility, and/or active histone marks. CTCF-associated interactions are most strongly enriched in the middle genomic distance range (∼700 kb to 1.5 Mb), while interactions involving actively marked DNase-accessible elements are enriched at both short (<500 kb) and longer (>1.5 Mb) genomic distances. There is a striking enrichment of longer-range interactions connecting replication-dependent histone genes on chromosome 6, potentially representing the chromatin architecture at the histone locus body.
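To make the hurdle negative binomial idea concrete, the following is a minimal Python sketch of how an upper-tail P value could be computed for one bin pair once a model has been fitted. It is not the HiC-DC implementation: the function names (`nb_pmf`, `hurdle_nb_pvalue`), the NB2 mean/dispersion parameterization, and the numeric values in the usage lines are all assumptions made for illustration. In the method described above, the expected count would come from a regression on covariates such as genomic distance, GC content, and mappability; here the fitted values are simply passed in.

```python
import math


def nb_pmf(k, mu, alpha):
    """Negative binomial pmf in the NB2 form (assumed parameterization):
    mean mu, dispersion alpha, variance mu + alpha * mu**2."""
    r = 1.0 / alpha          # "size" parameter
    p = r / (r + mu)         # success probability
    log_pmf = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
               + r * math.log(p) + k * math.log1p(-p))
    return math.exp(log_pmf)


def hurdle_nb_pvalue(y, mu, alpha, pi_zero):
    """Upper-tail P value P(Y >= y) under a hurdle negative binomial model.

    pi_zero is the modeled probability of a zero count (the hurdle part);
    positive counts follow a zero-truncated negative binomial.
    All arguments are hypothetical fitted values, not HiC-DC outputs.
    """
    if y <= 0:
        return 1.0
    # P_NB(X >= y), computed as one minus the lower cumulative sum.
    # For very large y this loses precision through cancellation.
    upper_tail = 1.0 - sum(nb_pmf(k, mu, alpha) for k in range(y))
    # Renormalize by P_NB(X >= 1) for the zero truncation, then scale by
    # the probability of crossing the hurdle.
    truncation = 1.0 - nb_pmf(0, mu, alpha)
    return (1.0 - pi_zero) * max(upper_tail, 0.0) / truncation


# Hypothetical bin pair: expected count 3.0, dispersion 0.5, 40% zeros.
for observed in (0, 1, 5, 20):
    print(observed, hurdle_nb_pvalue(observed, mu=3.0, alpha=0.5, pi_zero=0.4))
```

Separating the zero component from the count component lets the excess of zero-count bin pairs be modeled on its own, while the dispersion parameter absorbs variance beyond what a Poisson model would allow. Multiple-testing correction across bin pairs would still be needed and is omitted from this sketch.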
Keyphrases
  • high resolution
  • gene expression
  • transcription factor
  • genome wide
  • dna methylation
  • electronic health record
  • mass spectrometry
  • machine learning
  • immune response
  • big data
  • oxidative stress