Cross-continental admixture in the Kho population from northwest Pakistan.
Asifullah Khan, Leonardo Vallini, Shahid Aziz, Hizbullah Khan, Komal Zaib, Kiran Nigar, Qasim Ayub, Ling-Xiang Wang, Luca Pagani, Shao-Qing Wen
Published in: European Journal of Human Genetics: EJHG (2022)
Northern Pakistan is home to many diverse ethnicities and languages. The region acted as a prime corridor for ancient invasions and population migrations between Western Eurasia and South Asia. The Kho, one of the major ethnic groups of this region, live in the remote and isolated mountains of the Chitral Valley in the Hindu Kush range. They are culturally and linguistically distinct from the other Pakistani population groups, and their genetic ancestry is still unknown. In this study, we generated genome-wide genotype data of ~1 M loci (Illumina WeGene array) for 116 unrelated Kho individuals and carried out comprehensive analyses in the context of worldwide extant and ancient anatomically modern human populations across Eurasia. The results indicate that the Kho can trace a large proportion of their ancestry to the population that migrated south from the Southern Siberian steppes during the second millennium BCE, ~110 generations ago. We also identified an additional wave of gene flow into the Kho from a population carrying East Asian ancestry; it occurred ~60 generations ago and may be linked to the expansion of the Tibetan Empire during the 7th to 9th centuries CE (current era) into the northwestern regions of the Indian subcontinent. We identified several candidate regions suggestive of positive selection in the Kho; these include genes mainly involved in pigmentation, immune responses, muscular development, DNA repair, and tumor suppression.
Keyphrases
- genome wide
- dna repair
- immune response
- dna methylation
- healthcare
- endothelial cells
- genome wide association study
- south africa
- dendritic cells
- inflammatory response
- dna damage response
- toll like receptor
- high resolution
- mass spectrometry
- big data
- risk assessment
- machine learning
- genome wide identification
- single cell
- cord blood