Hi-C analysis: from data generation to integration.
Koustav Pal, Mattia Forcato, Francesco Ferrari. Published in: Biophysical Reviews (2018)
In the epigenetics field, large-scale functional genomics datasets of ever-increasing size and complexity have been produced using experimental techniques based on high-throughput sequencing. In particular, the study of the 3D organization of chromatin has raised increasing interest, thanks to the development of advanced experimental techniques. In this context, Hi-C has been widely adopted as a high-throughput method to measure pairwise contacts between virtually any pair of genomic loci, thus yielding unprecedented challenges for analyzing and handling the resulting complex datasets. In this review, we focus on the increasing complexity of available Hi-C datasets, which parallels the adoption of novel protocol variants. We also review the complexity of the multiple data analysis steps required to preprocess Hi-C sequencing reads and extract biologically meaningful information. Finally, we discuss solutions for handling and visualizing such large genomics datasets.
Keyphrases
- single cell
- RNA-seq
- data analysis
- high throughput
- high throughput sequencing
- genome wide
- copy number
- electronic health record
- randomized controlled trial
- gene expression
- oxidative stress
- DNA damage
- transcription factor
- DNA methylation
- health information
- healthcare
- machine learning
- big data
- deep learning
- anti inflammatory
- single molecule
- genome wide association study
- fluorescent probe