The variation and evolution of complete human centromeres.

Glennis A Logsdon, Allison N Rozanski, Fedor D Ryabov, Tamara A Potapova, Valery A Shepelev, Claudia Rita Catacchio, David Porubsky, Yafei Mao, DongAhn Yoo, Mikko Rautiainen, Sergey Koren, Sergey Nurk, Julian K Lucas, Kendra Hoekzema, Katherine M Munson, Jennifer L Gerton, Adam M Phillippy, Mario Ventura, Ivan A Alexandrov, Evan E Eichler
Published in: Nature (2024)
Human centromeres have been traditionally very difficult to sequence and assemble owing to their repetitive nature and large size [1]. As a result, patterns of human centromeric variation and models for their evolution and function remain incomplete, despite centromeres being among the most rapidly mutating regions [2,3]. Here, using long-read sequencing, we completely sequenced and assembled all centromeres from a second human genome and compared it to the finished reference genome [4,5]. We find that the two sets of centromeres show at least a 4.1-fold increase in single-nucleotide variation when compared with their unique flanks and vary up to 3-fold in size. Moreover, we find that 45.8% of centromeric sequence cannot be reliably aligned using standard methods owing to the emergence of new α-satellite higher-order repeats (HORs). DNA methylation and CENP-A chromatin immunoprecipitation experiments show that 26% of the centromeres differ in their kinetochore position by >500 kb. To understand evolutionary change, we selected six chromosomes and sequenced and assembled 31 orthologous centromeres from the common chimpanzee, orangutan and macaque genomes. Comparative analyses reveal a nearly complete turnover of α-satellite HORs, with characteristic idiosyncratic changes in α-satellite HORs for each species. Phylogenetic reconstruction of human haplotypes supports limited to no recombination between the short (p) and long (q) arms across centromeres and reveals that novel α-satellite HORs share a monophyletic origin, providing a strategy to estimate the rate of saltatory amplification and mutation of human centromeric DNA.
Keyphrases
  • endothelial cells
  • DNA methylation
  • genome wide
  • induced pluripotent stem cells
  • pluripotent stem cells
  • gene expression
  • transcription factor
  • body composition
  • postmenopausal women
  • nucleic acid
  • circulating tumor