Pinpointing Genomic Regions and Candidate Genes Associated with Seed Oil and Protein Content in Soybean through an Integrative Transcriptomic and QTL Meta-Analysis.
Virender Kumar, Vinod Goyal, Rushil Mandlik, Surbhi Kumawat, Sreeja Sudhakaran, Gunashri Padalkar, Nitika Rana, Rupesh Kailasrao Deshmukh, Joy Roy, Tilak Raj Sharma, Humira Sonah
Published in: Cells (2022)
Nutrient-enriched soybean has emerged as a prominent source of edible oil and protein. In the present study, a meta-analysis was performed by integrating quantitative trait loci (QTL) information, region-specific association analysis and transcriptomic analysis. Analysis of about a thousand QTLs previously identified in soybean pinpointed 14 meta-QTLs for oil content and 16 meta-QTLs for protein content. Region-specific association analysis using whole-genome resequencing data was then performed for the most promising meta-QTL regions on chromosomes 6 and 20. Of the 468 genes related to fatty acid and protein metabolic pathways identified within the meta-QTL regions, only 94 were found to be expressed in seeds. Allele mining and haplotyping of these selected genes were performed using whole-genome resequencing data. Several genes showed a significant haplotypic association with oil and protein content. For instance, for the FAD2-1B gene, the average seed oil content was 20.22% for haplotype 1 compared with 15.52% for haplotype 5. In addition, the S86F mutation in the FAD2-1B gene has a destabilizing effect, with a stability change (ΔΔG) of -0.31 kcal/mol. Transcriptomic analysis revealed tissue-specific expression of the candidate genes. Based on their higher expression during seed developmental stages, genes such as those encoding a sugar transporter, fatty acid desaturase (FAD), lipid transporter, major facilitator protein and amino acid transporter can be targeted for functional validation. The approach and information generated in the present study will be helpful for map-based cloning of regulatory genes, as well as for marker-assisted breeding in soybean.
Keyphrases
- fatty acid
- genome wide
- amino acid
- genome wide identification
- binding protein
- protein protein
- systematic review
- poor prognosis
- dna methylation
- randomized controlled trial
- gene expression
- transcription factor
- healthcare
- bioinformatics analysis
- small molecule
- genome wide analysis
- high density
- machine learning
- data analysis