A common resequencing-based genetic marker data set for global maize diversity.
Marcin W Grzybowski, Ravi V Mural, Gen Xu, Jonathan Turkus, Jinliang Yang, Osler A Ortez
Published in: The Plant journal : for cell and molecular biology (2023)
Maize (Zea mays ssp. mays) populations exhibit vast ranges of genetic and phenotypic diversity. As sequencing costs have declined, an increasing number of projects have sought to measure genetic differences between and within maize populations using whole-genome resequencing strategies, identifying millions of segregating single-nucleotide polymorphisms (SNPs) and insertions/deletions (InDels). Unlike older genotyping strategies such as microarrays and genotyping by sequencing, resequencing should, in principle, frequently identify and score common genetic variants. However, in practice, different projects frequently employ different analytical pipelines, often employ different reference genome assemblies, and consistently filter for minor allele frequency within the study population. This constrains the potential to reuse and remix data on genetic diversity generated from different projects to address new biological questions in new ways. Here, we employ resequencing data from 1276 previously published maize samples and 239 newly resequenced maize samples to generate a single unified marker set of approximately 366 million segregating variants and approximately 46 million high-confidence variants scored across crop wild relatives, landraces, and tropical and temperate lines from different breeding eras. We demonstrate that the new variant set provides increased power to identify known causal flowering-time genes using previously published trait data sets, as well as the potential to track changes in the frequency of functionally distinct alleles across the global distribution of modern maize.