Evaluating three strategies of genome-wide association analysis for integrating data from multiple populations.
Zhanming Zhong, Guang-Zhen Li, Zhiting Xu, Haonan Zeng, Jinyan Teng, Xueyan Feng, Shuqi Diao, Yahui Gao, Jiaqi Li, Zhe Zhang
Published in: Animal Genetics (2024)
In livestock, genome-wide association studies (GWAS) are usually conducted in a single population (single-GWAS), which limits sample size and detection power. To enhance the detection power of GWAS, meta-analysis of GWAS (meta-GWAS) and mega-analysis of GWAS (mega-GWAS) have been proposed to integrate data from multiple populations at the level of summary statistics or individual data, respectively. However, these strategies have rarely been compared directly, which makes it difficult to recommend a best practice for GWAS that integrates data from multiple study populations. To compare the association analysis strategies across multiple populations as thoroughly as possible, we conducted single-GWAS, meta-GWAS, and mega-GWAS for backfat thickness at 100 kg (BFT_100) and days to 100 kg (DAYS_100) within each of the three commercial pig breeds (Duroc, Yorkshire, and Landrace). After adjusting the genomic inflation factor to one, we calculated corrected p-values (p_C). In Yorkshire, the breed with the largest sample size, mega-GWAS, meta-GWAS, and single-GWAS detected 149, 38, and 20 significant SNPs (p_C < 1E-5) associated with BFT_100, corresponding to 26, four, and one QTL, respectively. Among them, the p_C of SNPs from mega-GWAS was the lowest, followed by meta-GWAS and single-GWAS. The correlation of p_C among the three GWAS strategies ranged from 0.60 to 0.75, and the correlation of SNP effect values between meta-GWAS and mega-GWAS was 0.74, all indicating good agreement. Collectively, although integrating individual data and integrating summary statistics give somewhat different results, integrating data from multiple populations is an effective means of genetic analysis for complex traits, with mega-GWAS showing the clearest gain over single-GWAS.
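The abstract does not say how the corrected p-values were computed. As a minimal sketch, assuming the standard genomic-control approach (divide each 1-df chi-square statistic by the inflation factor, so that the inflation factor of the corrected statistics equals one), the correction could look like the following. The function names are hypothetical and are not taken from the paper.

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a chi-square distribution with 1 degree of freedom (about 0.4549).
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def p_to_chi2(p):
    """Convert a two-sided p-value (0 < p <= 1) to a 1-df chi-square statistic."""
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def chi2_to_p(chi2):
    """Upper-tail probability of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def inflation_factor(p_values):
    """Genomic inflation factor: median observed chi-square over its expected median."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


def genomic_control(p_values):
    """Return (corrected p-values, inflation factor) under the assumed correction."""
    lam = inflation_factor(p_values)
    corrected = [chi2_to_p(p_to_chi2(p) / lam) for p in p_values]
    return corrected, lam


# Toy p-values for illustration only; these are not data from the study.
raw = [0.5, 0.2, 0.01, 0.001, 0.3, 0.04, 0.0005]
corrected, lam = genomic_control(raw)
```

Applying the same correction within each strategy puts the three sets of p_C on a common scale before SNPs are counted against a threshold such as 1E-5. Note that this sketch rejects p-values of exactly zero, which would need to be handled separately.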