Origins of East Caucasus Gene Pool: Contributions of Autochthonous Bronze Age Populations and Migrations from West Asia Estimated from Y-Chromosome Data.

Anastasia Agdzhoyan, Nasib Iskandarov, Georgy Ponomarev, Vladimir Pylev, Sergey Koshel, Vugar Salaev, Elvira Pocheshkhova, Zhaneta Kagazezheva, Elena Balanovska
Published in: Genes (2023)
The gene pool of the East Caucasus, encompassing modern-day Azerbaijan and Dagestan populations, was studied alongside adjacent populations using 83 Y-chromosome SNP markers. The analysis of genetic distances among 18 populations (N = 2216) representing Nakh-Dagestani, Altaic, and Indo-European language families revealed the presence of three components (Steppe, Iranian, and Dagestani) that emerged in different historical periods. The Steppe component occurs only in Karanogais, indicating a recent medieval migration of Turkic-speaking nomads from the Eurasian steppe. The Iranian component is observed in Azerbaijanis, Dagestani Tabasarans, and all Iranian-speaking peoples of the Caucasus. The Dagestani component predominates in Dagestani-speaking populations, except for Tabasarans, and in Turkic-speaking Kumyks. Each component is associated with a distinct Y-chromosome haplogroup complex: the Steppe includes C-M217, N-LLY22g, R1b-M73, and R1a-M198; the Iranian includes J2-M172(×M67, M12) and R1b-M269; the Dagestani includes J1-Y3495 lineages. We propose that the most common lineage of haplogroup J1-Y3495 originated in an autochthonous ancestral population in central Dagestan and split ~6 kya into J1-ZS3114 (Dargins, Laks, Lezgi-speaking populations) and J1-CTS1460 (Avar-Andi-Tsez linguistic group). Based on archeological finds and DNA data, the analysis of J1-Y3495 phylogeography suggests that population growth in the territory of modern-day Dagestan started in the Bronze Age and was followed by further dispersal and microevolution of the diverged populations.