An organism-wide ATAC-seq peak catalogue for the bovine and its use to identify regulatory variants.
Can Yuan, Lijing Tang, Thomas John Lopdell, Vyacheslav A Petrov, Claire Oget, Gabriel Costa Monteiro Moreira, José Luis Gualdrón Duarte, Arnaud Sartelet, Zhangrui Cheng, Mazdak Salavati, Claire D Wathes, Mark A Crowe, Wouter Coppieters, Mathew Littlejohn, Carole Charlier, Tom Druet, Michel Georges, Haruko Takeda
Published in: Genome Research (2023)
We report the generation of an organism-wide catalogue of 976,813 cis-acting regulatory elements for the bovine, detected by the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). We group these regulatory elements into 16 components by nonnegative matrix factorization. Correlations between the genome-wide density of peaks and transcription start sites, correlations between peak accessibility and expression of neighboring genes, and enrichment in transcription factor binding motifs support their regulatory potential. Using a previously established catalogue of 12,736,643 variants, we show that the proportion of single nucleotide polymorphisms mapping to ATAC-seq peaks is higher than expected and that this is due to an ~1.3-fold higher mutation rate within peaks than outside them. The site frequency spectrum of variants in ATAC-seq peaks indicates that they are subject to purifying selection. We generate eQTL datasets for liver and blood and show that variants that drive eQTL fall into liver- and blood-specific ATAC-seq peaks more often than expected by chance. We combine ATAC-seq and eQTL data to estimate that the proportion of regulatory variants mapping to ATAC-seq peaks is approximately 1 in 3, and that the proportion of variants mapping to ATAC-seq peaks that are regulatory is approximately 1 in 25. We discuss the implications of these findings for the utility of ATAC-seq information in improving the accuracy of genomic selection.
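
The two closing proportions are conditional probabilities in opposite directions: about 1 in 3 is P(in peak | regulatory) and about 1 in 25 is P(regulatory | in peak). The minimal sketch below shows how Bayes' rule links them once the overall fraction of variants falling in peaks is known. That fraction is not stated in the abstract, so `p_peak` is a purely hypothetical placeholder, and the function name is ours rather than the authors'.

```python
def implied_regulatory_fractions(p_peak_given_reg, p_reg_given_peak, p_peak):
    """Relate the two reported proportions with Bayes' rule.

    p_peak_given_reg: P(variant in ATAC-seq peak | variant is regulatory), ~1/3 in the abstract
    p_reg_given_peak: P(variant is regulatory | variant in ATAC-seq peak), ~1/25 in the abstract
    p_peak:           P(variant in ATAC-seq peak); NOT given in the abstract (assumption)
    """
    for name, value in (("p_peak_given_reg", p_peak_given_reg),
                        ("p_reg_given_peak", p_reg_given_peak),
                        ("p_peak", p_peak)):
        if not 0.0 < value < 1.0:
            raise ValueError(f"{name} must lie strictly between 0 and 1")

    # Bayes' rule: P(reg) = P(reg | peak) * P(peak) / P(peak | reg)
    p_reg = p_reg_given_peak * p_peak / p_peak_given_reg
    # Regulatory fraction among variants outside peaks
    p_reg_outside = p_reg * (1.0 - p_peak_given_reg) / (1.0 - p_peak)
    # Fold enrichment of regulatory variants inside versus outside peaks
    enrichment = p_reg_given_peak / p_reg_outside
    return p_reg, p_reg_outside, enrichment


# Illustration only: 0.10 is a hypothetical share of variants in peaks.
p_reg, p_reg_outside, enrichment = implied_regulatory_fractions(1 / 3, 1 / 25, 0.10)
print(f"implied overall regulatory fraction: {p_reg:.4f}")
print(f"regulatory fraction outside peaks:   {p_reg_outside:.4f}")
print(f"fold enrichment inside peaks:        {enrichment:.2f}")
```

The printed values depend entirely on the assumed `p_peak` and are not results from the study. The point is only that the "1 in 3" and "1 in 25" figures together constrain how much prioritizing variants in peaks could help in genomic selection.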