Securing the use of existing sample collections for future human genetic research.

George Kanoungi, Peter Nürnberg, Michael Nothnagel
Published in: European journal of human genetics : EJHG (2017)
Hundreds of thousands of individuals have been genotyped in the past decades using genotyping arrays, representing both a valuable data resource for future biomedical research and a substantial investment in human genetic research. However, novel chip designs and their altered sets of single-nucleotide polymorphisms (SNPs) raise the question of how effectively established data resources, such as large samples of healthy controls genotyped on legacy arrays, can be combined through genotype imputation with newer samples genotyped on those novel arrays. As a case study, we investigated this question using genotype data of 30 European and 30 African unrelated samples from the 1000 Genomes Project and markers present on two legacy SNP arrays, namely Affymetrix's Human SNP 6.0 and Illumina's 550k array, and three newer arrays, namely two Axiom arrays from Affymetrix and an OmniExpress array from Illumina. We cross-compared imputation accuracy and efficacy and assessed genotype concordance among these arrays. Although the accuracy of genotype prediction was uniformly high across all arrays, the imputation efficacy, that is, the proportion of successfully imputed markers, differed considerably between array combinations in both sample sets, with legacy arrays tending towards lower efficacy than newer arrays when serving as the basis for imputation. We conclude that, given the substantial loss of markers covered by the legacy arrays, re-genotyping existing sample sets, in particular those of healthy population controls, would be a worthwhile endeavor to secure their continued use in the future.
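The two quantities compared in the abstract, imputation efficacy and accuracy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: genotypes are coded as alternate-allele counts (0, 1, 2), `None` marks a target marker that was not successfully imputed, accuracy is taken as simple genotype concordance among the imputed markers, and the function names and example data are hypothetical.

```python
from typing import Optional, Sequence


def imputation_efficacy(imputed: Sequence[Optional[int]]) -> float:
    """Proportion of target markers that received an imputed genotype.

    Assumption: None marks a marker that could not be imputed
    (or was filtered out); the abstract does not specify the criterion.
    """
    if not imputed:
        raise ValueError("no target markers given")
    return sum(g is not None for g in imputed) / len(imputed)


def imputation_accuracy(imputed: Sequence[Optional[int]],
                        truth: Sequence[int]) -> float:
    """Genotype concordance, restricted to successfully imputed markers."""
    if len(imputed) != len(truth):
        raise ValueError("imputed and truth must have the same length")
    pairs = [(g, t) for g, t in zip(imputed, truth) if g is not None]
    if not pairs:
        raise ValueError("no successfully imputed markers")
    return sum(g == t for g, t in pairs) / len(pairs)


# Hypothetical genotypes for one sample at eight target markers.
imputed_calls = [0, 1, None, 2, None, 1, 0, 2]
true_calls = [0, 1, 1, 2, 0, 2, 0, 2]

efficacy = imputation_efficacy(imputed_calls)
accuracy = imputation_accuracy(imputed_calls, true_calls)
print(f"efficacy={efficacy:.3f} accuracy={accuracy:.3f}")
```

Keeping the two measures separate mirrors the distinction drawn in the abstract: an array combination can yield highly concordant calls at the markers it does impute while still losing a large share of the target markers.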