Optimization of training sets for genomic prediction of early-stage single crosses in maize.
Dnyaneshwar C Kadam, Oscar R Rodriguez, Aaron J Lorenz
Published in: TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik (2021)
Training population optimization algorithms are useful for efficiently training genomic prediction models for single-cross performance, especially if the population is extended beyond only realized crosses to all possible single crosses. Genomic prediction of single-cross performance could allow effective evaluation of all possible single crosses between all inbreds developed in a hybrid breeding program. The objectives of the present study were to investigate the effect of different levels of relatedness on the genomic predictive ability of single crosses, to evaluate the usefulness of a deterministic formula for forecasting prediction accuracy in advance, and to determine the potential for training set (TRS) optimization based on the mean prediction error variance (PEVmean) and mean coefficient of determination (CDmean) criteria. We used 481 single crosses made by crossing 89 random recombinant inbred lines (RILs) belonging to the Iowa stiff stalk synthetic heterotic group with 103 random RILs belonging to the non-stiff stalk synthetic heterotic group. As expected, predictive ability was enhanced by ensuring close relationships between TRSs and target sets, even when TRS sizes were smaller. We found that designing a TRS based on the PEVmean or CDmean criterion is useful for increasing the efficiency of genomic prediction of maize single crosses. We then extended the sampling space from all observed single crosses to all possible single crosses, providing a much larger genetic space within which to design a training population. Using all possible single crosses increased the advantage of the PEVmean and CDmean methods in terms of expected prediction accuracy. This finding suggests that it may be worthwhile to use an optimization algorithm to select a training population from all possible single crosses to maximize efficiency in training accurate models for hybrid genomic prediction.
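To make the PEVmean and CDmean criteria concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes the commonly used mixed-model form of these criteria, with an overall mean as the only fixed effect and each candidate contrasted against the mean of all candidates. The function names (pev_cd_mean, optimize_cdmean, kinship), the variance ratio lam = 1, the simulated marker data, the Kronecker-product covariance among single crosses, and the simple exchange search are all assumptions made for illustration.

```python
import numpy as np

def pev_cd_mean(K, train_idx, target_idx, lam=1.0):
    """Return (PEVmean, CDmean) for a candidate training set.

    K   : relationship matrix among all candidates (N x N, positive definite)
    lam : assumed ratio of residual to genetic variance
    PEV is expressed in units of the residual variance.
    """
    N, n, t = K.shape[0], len(train_idx), len(target_idx)
    Z = np.zeros((n, N))
    Z[np.arange(n), train_idx] = 1.0        # links each record to a candidate
    M = np.eye(n) - np.ones((n, n)) / n     # projects out the overall mean
    C = np.linalg.inv(Z.T @ M @ Z + lam * np.linalg.inv(K))
    # One contrast per target: that candidate versus the mean of all candidates.
    T = -np.ones((N, t)) / N
    T[target_idx, np.arange(t)] += 1.0
    quad = lambda A: np.einsum("ij,ik,kj->j", T, A, T)  # t_j' A t_j for each j
    pev = quad(C) / np.einsum("ij,ij->j", T, T)
    cd = quad(K - lam * C) / quad(K)
    return pev.mean(), cd.mean()

def optimize_cdmean(K, target_idx, n_train, n_iter=200, lam=1.0, seed=0):
    """Exchange search: swap one member at a time, keep swaps that raise CDmean."""
    rng = np.random.default_rng(seed)
    N = K.shape[0]
    train = list(rng.choice(N, size=n_train, replace=False))
    start = best = pev_cd_mean(K, train, target_idx, lam)[1]
    for _ in range(n_iter):
        pool = np.setdiff1d(np.arange(N), train)   # candidates not yet selected
        cand = train.copy()
        cand[rng.integers(n_train)] = rng.choice(pool)
        score = pev_cd_mean(K, cand, target_idx, lam)[1]
        if score > best:
            train, best = cand, score
    return sorted(int(i) for i in train), start, best

# Toy data (hypothetical): 5 x 6 inbred parents give 30 possible single crosses.
rng = np.random.default_rng(1)

def kinship(n_lines, n_markers=200):
    X = rng.choice([-1.0, 1.0], size=(n_lines, n_markers))  # simulated inbred genotypes
    return X @ X.T / n_markers + 0.05 * np.eye(n_lines)     # small ridge keeps it invertible

K = np.kron(kinship(5), kinship(6))   # assumed covariance among all possible crosses
targets = list(range(K.shape[0]))     # here the target set is every possible cross
selected, cd_start, cd_end = optimize_cdmean(K, targets, n_train=10)
pev_mean, cd_mean = pev_cd_mean(K, selected, targets)
print(selected, round(cd_start, 3), round(cd_end, 3), round(pev_mean, 3))
```

Because the search only accepts swaps that raise CDmean, the final value is never lower than that of the random starting set. Minimizing PEVmean instead would only require comparing the first returned value with the inequality reversed.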