
Significance testing and genomic inflation factor using high-density genotypes or whole-genome sequence data.

Sanne van den Berg, Jérémie Vandenplas, Fred A van Eeuwijk, Marcos S Lopes, Roel F Veerkamp
Published in: Journal of Animal Breeding and Genetics = Zeitschrift für Tierzüchtung und Züchtungsbiologie (2019)
Significance testing for genome-wide association studies (GWAS) with increasing SNP density, up to whole-genome sequence (WGS) data, is not straightforward because of strong LD between SNPs and population stratification. Therefore, the objective of this study was to investigate genomic control and different significance testing procedures using data from a commercial pig breeding scheme. A GWAS was performed in GCTA with data from 4,964 Large White pigs using medium-density, high-density or imputed whole-genome sequence data, fitting a genomic relationship matrix based on a leave-one-chromosome-out approach to account for population structure. Subsequently, genomic inflation factors were assessed at the whole-genome level and the chromosome level. To establish a significance threshold, permutation testing, Bonferroni corrections using either the total number of SNPs or the number of independent chromosome fragments, and false discovery rates (FDR) using either the Benjamini-Hochberg procedure or the Benjamini and Yekutieli procedure were evaluated. We found that genomic inflation factors did not differ between genotype densities but did differ between chromosomes. Also, neither the leave-one-chromosome-out approach for GWAS nor the use of pedigree relationships accounted appropriately for population stratification, and both gave strong genomic inflation. Regarding the different procedures for significance testing, when the aim is to find QTL regions that are associated with a trait of interest, we recommend applying the FDR following the Benjamini and Yekutieli approach to establish a significance threshold that is adjusted for multiple testing. When the aim is to pinpoint a specific mutation, the more conservative Bonferroni correction based on the total number of SNPs is more appropriate, until an appropriate method is established to adjust for the number of independent tests.
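The quantities compared in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code (the study ran its GWAS in GCTA), and the function names, the default `alpha=0.05` and the toy p-values are assumptions made for illustration only. The sketch computes a genomic inflation factor as the median observed 1-df chi-square statistic divided by its expected median, and derives significance thresholds from the Bonferroni correction and from the Benjamini-Hochberg and Benjamini-Yekutieli step-up procedures. P-values are assumed to lie in (0, 1].

```python
from statistics import NormalDist, median

# Expected median of a chi-square distribution with 1 degree of freedom
# (about 0.455), obtained from the standard normal quantile at 0.75.
CHI2_1DF_MEDIAN = NormalDist().inv_cdf(0.75) ** 2


def genomic_inflation(p_values):
    """Genomic inflation factor: median observed 1-df chi-square / expected median."""
    norm = NormalDist()
    # A two-sided p-value maps to a 1-df chi-square statistic via z**2.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold when correcting for n_tests (e.g. the total number of SNPs)."""
    return alpha / n_tests


def fdr_threshold(p_values, alpha=0.05, dependent=False):
    """Largest p-value declared significant by a step-up FDR procedure.

    dependent=False gives Benjamini-Hochberg; dependent=True gives
    Benjamini-Yekutieli, which divides alpha by 1 + 1/2 + ... + 1/m.
    Returns None when no test is significant.
    """
    m = len(p_values)
    c_m = sum(1.0 / i for i in range(1, m + 1)) if dependent else 1.0
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= rank * alpha / (m * c_m):
            threshold = p  # keep the largest rank that passes
    return threshold


# Toy p-values (hypothetical, not from the study).
toy = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(bonferroni_threshold(len(toy)))
print(fdr_threshold(toy))                  # Benjamini-Hochberg
print(fdr_threshold(toy, dependent=True))  # Benjamini-Yekutieli
print(genomic_inflation(toy))
```

On the toy data the Benjamini-Yekutieli threshold is stricter than the Benjamini-Hochberg one, because it remains valid under arbitrary dependence between tests. The same inflation function could be applied per chromosome by passing only that chromosome's p-values.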
Keyphrases
  • high density
  • copy number
  • genome wide
  • electronic health record
  • big data
  • genome wide association study
  • dna methylation
  • minimally invasive
  • machine learning
  • high throughput
  • data analysis
  • amino acid
  • deep learning