scHi-CSim: a flexible simulator that generates high-fidelity single-cell Hi-C data for benchmarking.

Shichen Fan, Dachang Dang, Yusen Ye, Shao-Wu Zhang, Lin Gao, Shihua Zhang
Published in: Journal of Molecular Cell Biology (2023)
Single-cell Hi-C technology provides an unprecedented opportunity to reveal chromatin structure in individual cells. However, high sequencing costs impede the generation of biological Hi-C data with high sequencing depth and multiple replicates for downstream analysis. Here we developed a single-cell Hi-C simulator (scHi-CSim) that generates high-fidelity data for benchmarking. scHi-CSim merges neighboring cells to overcome data sparsity, samples interactions within distance-stratified chromosomes to maintain the heterogeneity of single cells, and estimates the empirical distribution of restriction fragments to generate simulated data. We demonstrated the fidelity of the simulated data by comparing single-cell clustering and the detection of higher-order chromosomal structures on simulated data against the same analyses on raw data. Furthermore, scHi-CSim lets users vary the sequencing depth and the number of simulated replicates. We showed that increasing sequencing depth could improve the accuracy of detecting topologically associating domains. We also used scHi-CSim to generate a series of simulated datasets at different sequencing depths to benchmark single-cell Hi-C clustering methods.
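To make the idea of distance-stratified sampling concrete, the following is a minimal Python sketch, not the scHi-CSim implementation. It assumes a binned, symmetric contact-count matrix for one chromosome, treats each matrix diagonal (one genomic distance) as a stratum, and draws contacts within each stratum in proportion to the observed counts, so the simulated depth can be changed while the distance profile is roughly preserved. The function name `sample_by_distance` and its parameters are hypothetical, and the cell-merging and restriction-fragment steps described in the abstract are not modeled here.

```python
import numpy as np


def sample_by_distance(contacts, target_depth, seed=0):
    """Draw roughly `target_depth` contacts from a contact-count matrix,
    one genomic-distance stratum (matrix diagonal) at a time.

    Illustrative assumption: `contacts` is a square, symmetric matrix of
    non-negative counts; only its upper triangle is used.
    """
    rng = np.random.default_rng(seed)
    n = contacts.shape[0]
    upper = np.triu(contacts)
    total = upper.sum()
    simulated = np.zeros_like(upper, dtype=np.int64)
    if total == 0:
        return simulated

    for d in range(n):
        # Counts at genomic distance d (in bins).
        diag = np.diagonal(upper, offset=d).astype(float)
        stratum_total = diag.sum()
        if stratum_total == 0:
            continue
        # Each stratum keeps its share of the total contacts.
        n_draws = int(round(target_depth * stratum_total / total))
        # Within the stratum, sample bin pairs in proportion to observed counts.
        draws = rng.multinomial(n_draws, diag / stratum_total)
        idx = np.arange(n - d)
        simulated[idx, idx + d] = draws
    return simulated


if __name__ == "__main__":
    observed = np.array([[10, 5, 1],
                         [5, 8, 4],
                         [1, 4, 6]])
    print(sample_by_distance(observed, target_depth=100, seed=1))
```

Because each stratum's draw count is rounded independently, the simulated total can differ from `target_depth` by a few contacts. Bin pairs with zero observed counts stay at zero in this sketch.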