BAC-End Sequence-Based SNP Mining in Allotetraploid Cotton (Gossypium) Utilizing Resequencing Data, Phylogenetic Inferences, and Perspectives for Genetic Mapping.
Amanda M Hulse-Kemp, Hamid Ashrafi, David M Stelly, Xiuting Zheng, Christopher A Saski, Brian E Scheffler, David D Fang, Z Jeffrey Chen, Allen Van Deynze, David M Stelly
Published in: G3 (Bethesda, Md.) (2015)
A bacterial artificial chromosome library and BAC-end sequences for cultivated cotton (Gossypium hirsutum L.) have recently been developed. This report presents genome-wide single nucleotide polymorphism (SNP) mining from resequencing data, in which reads from 12 G. hirsutum L. lines, one G. barbadense L. line, and one G. longicalyx Hutch and Lee line were aligned to the BAC-end sequences as a reference. A total of 132,262 intraspecific SNPs have been developed for G. hirsutum, whereas 223,138 and 470,631 interspecific SNPs have been developed for G. barbadense and G. longicalyx, respectively. Using a set of 88 interspecific SNPs (11 randomly selected and 77 putatively associated with the homeologous chromosome pair 12 and 26), we mapped 77 SNPs into two linkage groups representing these chromosomes, spanning a total of 236.2 cM in an interspecific F2 population (G. barbadense 3-79 × G. hirsutum TM-1). The mapping results validated the approach for reliably producing large numbers of both intraspecific and interspecific SNPs aligned to BAC-ends. This will allow for future construction of high-density integrated physical and genetic maps for cotton and other complex polyploid genomes. The methods developed will also allow future Gossypium resequencing data to be automatically genotyped at the identified SNPs along the BAC-end sequence reference, for anchoring sequence assemblies and for comparative studies.
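The distinction between intraspecific and interspecific SNPs can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `classify_site`, the input layout, and the decision rules are assumptions made for illustration, and the abstract does not state the actual calling criteria (coverage thresholds, handling of homeologous loci in the allotetraploid, and so on). The sketch assumes one homozygous base call per line at a given BAC-end sequence position, with `None` marking a missing call.

```python
from typing import Dict, List, Optional


def classify_site(hirsutum_calls: List[Optional[str]],
                  other_call: Optional[str]) -> Dict[str, bool]:
    """Classify one reference position (illustrative rules, not the paper's).

    hirsutum_calls: one base call per G. hirsutum line (None = no call).
    other_call: base call for a single line of another species (None = no call).
    """
    # Distinct alleles observed among the G. hirsutum lines, ignoring missing calls.
    called = {base for base in hirsutum_calls if base is not None}

    # Assumed rule: intraspecific if the G. hirsutum lines carry more than one allele.
    intraspecific = len(called) > 1

    # Assumed rule: interspecific if the G. hirsutum lines are fixed for one allele
    # and the other species carries a different one.
    interspecific = (
        len(called) == 1
        and other_call is not None
        and other_call not in called
    )
    return {"intraspecific": intraspecific, "interspecific": interspecific}


# Hypothetical sites keyed by (BAC-end sequence id, position); the data are made up.
sites = {
    ("BES_0001", 120): (["A", "A", "G", None], "A"),
    ("BES_0001", 305): (["C", "C", "C", "C"], "T"),
    ("BES_0002", 48): (["T", "T", None, "T"], "T"),
}

for (bes_id, pos), (hirsutum, other) in sites.items():
    print(bes_id, pos, classify_site(hirsutum, other))
```

Under these assumed rules a site is reported as interspecific only when the G. hirsutum lines agree with one another, which keeps the two categories disjoint. A real analysis might instead allow a site to belong to both.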
Keyphrases
- genome wide
- high density
- genome wide analysis
- genome wide identification