Systematic benchmarking of single-cell ATAC-sequencing protocols.
Florian V De Rop, Gert Hulselmans, Christopher Flerin, Paula Soler-Vila, Albert Rafels, Valerie Christiaens, Carmen Bravo González-Blas, Domenica Marchese, Ginevra Caratù, Suresh Poovathingal, Orit Rozenblatt-Rosen, Michael Slyper, Wendy Luo, Christoph Muus, Fabiana M Duarte, Rojesh Shrestha, S Tansu Bagdatli, M Ryan Corces, Lira Mamanova, Andrew Knights, Kerstin B Meyer, Ryan M Mulqueen, Akram Taherinasab, Patrick Maschmeyer, Joern Pezoldt, Camille Lucie Germaine Lambert, Marta Iglesias, Sebastián R Najle, Zain Y Dossani, Luciano G Martelotto, Zachary Daniel Burkett, Ronald Lebofsky, José Ignacio Martin-Subero, Satish Pillai, Arnau Sebé-Pedrós, Bart Deplancke, Sarah A Teichmann, Leif S Ludwig, Theodore P Braun, Andrew C Adey, William J Greenleaf, Jason D Buenrostro, Aviv Regev, Stein Aerts, Holger Heyn
Published in: Nature Biotechnology (2023)
Single-cell assay for transposase-accessible chromatin by sequencing (scATAC-seq) has emerged as a powerful tool for dissecting regulatory landscapes and cellular heterogeneity. However, an exploration of systemic biases among scATAC-seq technologies has been lacking. In this study, we benchmark the performance of eight scATAC-seq methods across 47 experiments using human peripheral blood mononuclear cells (PBMCs) as a reference sample and develop PUMATAC, a universal preprocessing pipeline, to handle the various sequencing data formats. Our analyses reveal significant differences in sequencing library complexity and tagmentation specificity, which impact cell-type annotation, genotype demultiplexing, peak calling, differential region accessibility and transcription factor motif enrichment. Our findings underscore the importance of sample extraction, method selection, data processing and total cost of experiments, offering valuable guidance for future research. Finally, our data and analysis pipeline encompass 169,000 PBMC scATAC-seq profiles and a best-practices code repository for scATAC-seq data analysis, which are freely available to extend this benchmarking effort to future protocols.