
A unified local objective function for optimally selecting SNPs on arrays for agricultural genomics applications.

X-L Wu, H Li, R Ferretti, B Simpson, J Walker, J Parham, L Mastro, J Qiu, T Schultz, R G Tait, S Bauck
Published in: Animal genetics (2020)
Over the years, SNP arrays have been designed with ad-hoc procedures, and the procedures and strategies varied considerably from case to case. Recently, a multiple-objective, local optimization (MOLO) algorithm was proposed to select SNPs for SNP arrays. It maximizes the adjusted SNP information (E score) under multiple constraints, e.g. on MAF, uniformness of SNP locations (U score), the inclusion of obligatory SNPs, and the number and size of gaps. In the MOLO, each chromosome is split into equally spaced segments, and the local optimum in each segment is the SNP with the highest adjusted E score, conditional on the presence of obligatory SNPs. The computation of the adjusted E score, however, is empirical, and it does not scale well between the uniformness of SNP locations and SNP informativeness. In addition, the MOLO objective function does not accommodate the selection of uniformly distributed SNPs. In the present study, we proposed a unified local function for optimally selecting SNPs, as an amendment to the MOLO algorithm. This new local function takes scalable weights between the uniformness and informativeness of SNPs, which allows the selection of SNPs under varied scenarios. The results showed that weighting the U and E scores together led to a higher imputation concordance rate than using the U score or the E score alone. The results from the evaluation of six commercial bovine SNP chips further confirmed this conclusion.
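The segment-wise selection described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the form of the unified local function, so the sketch assumes a simple convex combination `w * U + (1 - w) * E`. It also assumes E is the Shannon entropy of a biallelic SNP computed from its MAF, and U is the scaled closeness of a SNP to the midpoint of its segment. The names `select_snps`, `uniformness`, `shannon_entropy`, `weight_u`, and `min_maf` are hypothetical, and obligatory SNPs and gap constraints are left out for brevity.

```python
import math


def shannon_entropy(maf):
    """Assumed E score: entropy (in bits) of a biallelic SNP with this MAF."""
    if maf <= 0.0 or maf >= 1.0:
        return 0.0
    return -(maf * math.log2(maf) + (1.0 - maf) * math.log2(1.0 - maf))


def uniformness(pos, seg_start, seg_end):
    """Assumed U score: 1.0 at the segment midpoint, 0.0 at either edge."""
    mid = (seg_start + seg_end) / 2.0
    half = (seg_end - seg_start) / 2.0
    return 1.0 - abs(pos - mid) / half


def select_snps(snps, chrom_length, n_segments, weight_u, min_maf=0.0):
    """Pick one SNP per non-empty, equally spaced segment.

    snps is a list of (position, maf) tuples for one chromosome.
    weight_u in [0, 1] is the weight on U; (1 - weight_u) goes to E.
    """
    seg_len = chrom_length / n_segments
    best = {}  # segment index -> (score, position, maf)
    for pos, maf in snps:
        if maf < min_maf:
            continue  # MAF constraint
        idx = min(int(pos // seg_len), n_segments - 1)
        start = idx * seg_len
        u = uniformness(pos, start, start + seg_len)
        e = shannon_entropy(maf)
        score = weight_u * u + (1.0 - weight_u) * e
        if idx not in best or score > best[idx][0]:
            best[idx] = (score, pos, maf)
    # Return the selected SNPs in segment order.
    return [(pos, maf) for _, (_, pos, maf) in sorted(best.items())]


# Hypothetical candidates on a 200-unit chromosome split into 2 segments.
candidates = [(10, 0.5), (50, 0.1), (120, 0.3), (150, 0.05)]
by_information = select_snps(candidates, 200, 2, weight_u=0.0)
by_uniformness = select_snps(candidates, 200, 2, weight_u=1.0)
```

With `weight_u=0.0` the sketch reduces to picking the most informative SNP in each segment, and with `weight_u=1.0` it picks the SNP closest to each segment midpoint. Intermediate weights trade one off against the other, which is the kind of scalable weighting the abstract describes.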
Keyphrases
  • genome wide
  • dna methylation
  • copy number
  • high density
  • machine learning
  • climate change
  • genome wide association
  • deep learning
  • healthcare
  • risk assessment
  • health information
  • heavy metals
  • neural network
  • sewage sludge