Inferring the population history of Tai-Kadai-speaking people and southernmost Han Chinese on Hainan Island by genome-wide array genotyping.

Guanglin He, Zheng Wang, Jianxin Guo, Mengge Wang, Xing Zou, Renkuan Tang, Jing Liu, Han Zhang, Yingxiang Li, Rong Hu, Lan-Hai Wei, Gang Chen, Chuan-Chao Wang, Yiping Hou
Published in: European journal of human genetics : EJHG (2020)
Hainan Island, located between East Asia and Southeast Asia, represents an ideal region for the study of the genetic architecture of geographically isolated populations. However, the genetic structure and demographic history of the indigenous Tai-Kadai-speaking Hlai people and the recently expanded southernmost Han Chinese on this island are poorly characterized due to a lack of genetic data. Thus, we collected and genotyped 36 Qiongzhong Hlai and 48 Haikou Han individuals at 497,637 single nucleotide polymorphisms (SNPs). We applied principal component analysis, ADMIXTURE, symmetrical D-statistics, admixture-f3 statistics, qpWave, and qpAdm analysis to infer the population history. Our results revealed that East Asian populations are characterized by a north-south genetic cline, with Hlai at the southernmost end. We detected no recent gene flow from neighboring populations into Hlai; therefore, we used Hlai as an unadmixed proxy to model the admixture history of mainland Tai-Kadai-speaking populations and southern Han Chinese. The mainland Tai-Kadai-speaking populations appear to derive a larger proportion of their ancestry from a Hlai-related lineage, with additional admixture from South Asian-related or other neighboring populations. The Hlai group is also suggested to have contributed about half of the ancestry of Han Chinese in Hainan. The complex patterns of genetic structure in East Asia were shaped by language categories, geographical boundaries, and large southward population movements accompanied by language dispersal and the spread of agriculture.
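For readers unfamiliar with the admixture-f3 and D-statistics named in the abstract, the following is a minimal sketch of their textbook definitions, computed from per-SNP allele frequencies. It is an illustration only, not the authors' pipeline: the function names and the toy frequencies are hypothetical, and the estimators are the naive forms, without the finite-sample bias correction or block-jackknife standard errors that dedicated software applies in real analyses.

```python
from typing import Sequence


def f3(target: Sequence[float], src_a: Sequence[float], src_b: Sequence[float]) -> float:
    """Naive f3(target; A, B): mean over SNPs of (c - a) * (c - b).

    A clearly negative value is consistent with the target being admixed
    between populations related to A and B. No bias correction is applied.
    """
    if not target or not (len(target) == len(src_a) == len(src_b)):
        raise ValueError("frequency vectors must be non-empty and of equal length")
    total = sum((c - a) * (c - b) for c, a, b in zip(target, src_a, src_b))
    return total / len(target)


def d_statistic(w: Sequence[float], x: Sequence[float],
                y: Sequence[float], z: Sequence[float]) -> float:
    """Naive D(W, X; Y, Z) from allele frequencies.

    Values near zero are consistent with W and X being symmetrically
    related to Y and Z; significance testing is omitted in this sketch.
    """
    if not w or not (len(w) == len(x) == len(y) == len(z)):
        raise ValueError("frequency vectors must be non-empty and of equal length")
    num = sum((wi - xi) * (yi - zi) for wi, xi, yi, zi in zip(w, x, y, z))
    den = sum((wi + xi - 2 * wi * xi) * (yi + zi - 2 * yi * zi)
              for wi, xi, yi, zi in zip(w, x, y, z))
    if den == 0:
        raise ValueError("denominator is zero; no informative SNPs")
    return num / den


# Toy allele frequencies at four SNPs (invented for illustration, not study data).
pop_a = [0.1, 0.2, 0.9, 0.8]
pop_b = [0.9, 0.8, 0.1, 0.2]
mixed = [0.5, 0.5, 0.5, 0.5]  # an equal mixture of pop_a and pop_b

f3_mixed = f3(mixed, pop_a, pop_b)               # negative: signal of admixture
d_sym = d_statistic(pop_a, pop_a, pop_a, pop_b)  # zero: W and X are identical
```

In this toy case the mixed population sits exactly midway between the two sources at every SNP, so its f3 value is negative, whereas using one of the sources as the target gives zero.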
Keyphrases
  • genome wide
  • dna methylation
  • copy number
  • genetic diversity
  • autism spectrum disorder
  • high throughput
  • electronic health record
  • single cell
  • deep learning
  • genome wide association study