A Genome-Wide Association Study Reveals Region Associated with Seed Protein Content in Cowpea.

Yilin Chen, Haizheng Xiong, Waltram Ravelombola, Gehendra Bhattarai, Thomas Casey Barickman, Ibtisam Alatawi, Theresa Makawa Phiri, Kenani Chiwina, Beiquan Mou, Shyamalrau P Tallury, Ainong Shi

Published in: Plants (Basel, Switzerland) (2023)

Cowpea (Vigna unguiculata L. Walp., 2n = 2x = 22) is a protein-rich crop that complements staple cereals in human diets and serves as fodder for livestock. It is widely grown in Africa and in developing countries elsewhere, where it is a primary source of dietary protein; identifying protein-related loci is therefore necessary to improve cowpea breeding. In the current study, we conducted a genome-wide association study (GWAS) on 161 cowpea accessions (151 USDA germplasm accessions plus 10 Arkansas breeding lines) with a wide range of seed protein contents (21.8% to 28.9%), using 110,155 high-quality whole-genome single-nucleotide polymorphisms (SNPs) to identify markers associated with protein content, and then performed genomic prediction (GP) to support future breeding. A total of seven significant SNP markers for seed protein content were identified using five GWAS models: single-marker regression (SMR), the general linear model (GLM), the mixed linear model (MLM), Fixed and Random Model Circulating Probability Unification (FarmCPU), and Bayesian-information and Linkage-disequilibrium Iteratively Nested Keyway (BLINK). All seven markers are located at the same locus on chromosome 8. This locus was associated with the gene Vigun08g039200, which is annotated as a protein of the thioredoxin superfamily and plays a critical role in increasing protein content and improving nutritional quality. We also used GP to assess how accurately seed protein content can be predicted in cowpea. The GP was conducted using cross-prediction with five models, namely ridge regression best linear unbiased prediction (rrBLUP), Bayesian ridge regression (BRR), Bayesian A (BA), Bayesian B (BB), and Bayesian least absolute shrinkage and selection operator (BL), applied to seven random whole-genome marker sets with different densities (10 k, 5 k, 2 k, 1 k, 500, 200, and 7), as well as to the significant markers identified through GWAS.
The accuracies of the GP varied between 42.9% and 52.1% across the seven SNPs considered, depending on the model used. These findings not only have the potential to expedite the breeding cycle through early prediction of individual performance prior to phenotyping, but also offer practical implications for cowpea breeding programs striving to enhance seed protein content and nutritional quality.
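As an illustration of the cross-prediction idea behind the GP analysis, the following is a minimal sketch of ridge-regression genomic prediction evaluated by k-fold cross-validation. It is not the authors' pipeline. The genotype matrix and phenotypes are synthetic stand-ins, the function names `ridge_fit` and `cv_accuracy` are hypothetical, the penalty `lam` is fixed by hand (rrBLUP would normally estimate the equivalent shrinkage from variance components), and accuracy is assumed to be the Pearson correlation between predicted and observed values, which the abstract does not state explicitly.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Fit ridge regression on centered data; return (intercept, marker effects)."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    yc = y - y_mean
    n = Xc.shape[0]
    # Dual form: beta = Xc' (Xc Xc' + lam I)^-1 yc.
    # This solves an n x n system, which is cheap when markers >> accessions.
    alpha = np.linalg.solve(Xc @ Xc.T + lam * np.eye(n), yc)
    beta = Xc.T @ alpha
    return y_mean - x_mean @ beta, beta


def cv_accuracy(X, y, lam, k=5, seed=0):
    """Mean Pearson r between predicted and observed values over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    correlations = []
    for test in folds:
        train = np.setdiff1d(np.arange(len(y)), test)
        intercept, beta = ridge_fit(X[train], y[train], lam)
        pred = intercept + X[test] @ beta
        correlations.append(np.corrcoef(pred, y[test])[0, 1])
    return float(np.mean(correlations))


# Synthetic stand-in data (assumption): 161 accessions, 1,000 SNPs coded 0/1/2,
# with 7 markers given a nonzero effect on a protein-like trait.
rng = np.random.default_rng(42)
n_acc, n_snp = 161, 1000
X = rng.integers(0, 3, size=(n_acc, n_snp)).astype(float)
effects = np.zeros(n_snp)
effects[:7] = 0.5
y = 25.0 + X @ effects + rng.normal(0.0, 1.0, n_acc)

acc = cv_accuracy(X, y, lam=100.0)
print(f"mean cross-validated accuracy (Pearson r): {acc:.3f}")
```

Repeating the same call on random column subsets of `X` (for example 500 or 200 markers) would mimic the comparison across marker densities described above; the numbers printed for synthetic data say nothing about the accuracies reported in the study.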
