Profiling of Fusarium head blight resistance QTL haplotypes through molecular markers, genotyping-by-sequencing, and machine learning.
Zachary James Winn, Jeanette Lyerly, Brian Ward, Gina Brown-Guedira, Richard E Boyles, Mohamed Mergoum, Jerry Johnson, Stephen Harrison, Ali Babar, Richard E Mason, Russell Sutton, J Paul Murphy
Published in: TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik (2022)
Marker-assisted selection is important for cultivar development. We propose a system where a training population genotyped for QTL and genome-wide markers may predict QTL haplotypes in early-development germplasm. Breeders screen germplasm with molecular markers to identify and select individuals that have desirable haplotypes. The objective of this research was to investigate whether QTL haplotypes can be accurately predicted using SNPs derived by genotyping-by-sequencing (GBS). In the SunGrains program during 2020 (SG20) and 2021 (SG21), 1,536 and 2,352 lines, respectively, submitted for GBS were genotyped with markers linked to the Fusarium head blight QTL Qfhb.nc-1A, Qfhb.vt-1B, Fhb1, and Qfhb.nc-4A. In parallel, data were compiled from the 2011-2020 Southern Uniform Winter Wheat Scab Nursery (SUWWSN), which had been screened for the same QTL, sequenced via GBS, and phenotyped for visual Fusarium severity rating (SEV), percent Fusarium-damaged kernels (FDK), deoxynivalenol content (DON), plant height, and heading date. Three machine learning models were evaluated: random forest, k-nearest neighbors, and gradient boosting machine. Data were randomly partitioned into training-testing splits. The QTL haplotype and the 100 most correlated GBS SNPs were used for training and tuning of each model. Trained machine learning models were used to predict QTL haplotypes in the testing partition of SG20, SG21, and the total SUWWSN. Mean disease ratings for the observed and predicted QTL haplotypes were compared in the SUWWSN. For all models trained on SG20 and SG21, the estimated group means of the observed Fhb1 haplotype for SEV, FDK, DON, plant height, and heading date in the SUWWSN were not significantly different from those of any of the predicted Fhb1 calls. This indicated that machine learning may be utilized in breeding programs to accurately predict QTL haplotypes in earlier generations.
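
The workflow described above (random training-testing split, selection of the SNPs most correlated with the QTL haplotype, model training, prediction in the testing partition) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' pipeline: the data are simulated, markers are coded 0/1, only a plain k-nearest neighbors classifier is shown (one of the three model types named), and 10 SNPs are selected instead of 100 to keep the example small. The function names `simulate`, `top_correlated_snps`, and `knn_predict` are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def simulate(n_lines=200, n_markers=300, n_linked=20, concordance=0.9, seed=1):
    """Simulated data (assumption): the first n_linked SNPs agree with the QTL
    haplotype with probability `concordance`; all other SNPs are random."""
    rng = random.Random(seed)
    haplotypes = [rng.randint(0, 1) for _ in range(n_lines)]
    genotypes = []
    for h in haplotypes:
        row = []
        for j in range(n_markers):
            if j < n_linked:
                row.append(h if rng.random() < concordance else 1 - h)
            else:
                row.append(rng.randint(0, 1))
        genotypes.append(row)
    return genotypes, haplotypes


def top_correlated_snps(genotypes, haplotypes, n_snps):
    """Indices of the n_snps markers with the largest |correlation| to the haplotype."""
    scores = []
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        scores.append((abs(pearson(column, haplotypes)), j))
    scores.sort(reverse=True)
    return [j for _, j in scores[:n_snps]]


def knn_predict(train_x, train_y, query, k=5):
    """Majority vote among the k training lines with the fewest marker mismatches."""
    dists = sorted(
        (sum(a != b for a, b in zip(row, query)), label)
        for row, label in zip(train_x, train_y)
    )
    votes = [label for _, label in dists[:k]]
    return max(set(votes), key=votes.count)


genotypes, haplotypes = simulate()

# Random 70/30 training-testing partition (the split ratio is an assumption).
idx = list(range(len(haplotypes)))
random.Random(2).shuffle(idx)
cut = int(0.7 * len(idx))
train, test = idx[:cut], idx[cut:]

# Select SNPs using the training partition only, so the test lines stay unseen.
selected = top_correlated_snps(
    [genotypes[i] for i in train], [haplotypes[i] for i in train], n_snps=10
)

train_x = [[genotypes[i][j] for j in selected] for i in train]
train_y = [haplotypes[i] for i in train]

correct = 0
for i in test:
    query = [genotypes[i][j] for j in selected]
    if knn_predict(train_x, train_y, query) == haplotypes[i]:
        correct += 1

accuracy = correct / len(test)
print(f"selected SNP indices: {sorted(selected)}")
print(f"test accuracy: {accuracy:.2f}")
```

In this sketch the SNPs are ranked on the training partition alone; whether the study ranked SNPs before or after partitioning is not stated in the abstract. Swapping in a random forest or gradient boosting classifier would only change the `knn_predict` step.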