Pipeline for the Rapid Development of Cytogenetic Markers Using Genomic Data of Related Species.
Pavel Yu Kroupin, Victoria Kuznetsova, Dmitry Romanov, Alina Kocheshkova, Gennady I Karlov, Thi Xuan Dang, Thi Mai L Khuat, Ilya Kirov, Oleg Alexandrov, Alexander Polkhovskiy, Olga Razumova, Mikhail G Divashuk
Published in: Genes (2019)
Repetitive DNA, including tandem repeats (TRs), makes up a significant part of most eukaryotic genomes. TRs include rapidly evolving satellite DNA (satDNA) that can be shared by closely related species. Their abundance may be associated with evolutionary divergence, and they have been widely used for chromosome karyotyping by fluorescence in situ hybridization (FISH). Recent progress in whole-genome sequencing and bioinformatics tools enables rapid and cost-effective searches for TRs, including satDNA, that can be converted into molecular cytogenetic markers. In the case of closely related taxa, the genome sequence of one species (donor) can be used as a base for the development of chromosome markers for related species or genomes (target). Here, we present a pipeline for rapid and high-throughput screening for new satDNA TRs in whole-genome sequencing data of the donor genome and for the development of chromosome markers based on them that can be applied to the target genome. One of the main features of the pipeline is a preliminary estimation of TR abundance using qPCR, followed by ranking of the identified TRs according to their copy number in the target genome; this facilitates the selection of the most promising (most abundant) TRs that can be converted into cytogenetic markers. Another feature of our pipeline is that FISH probes are prepared by PCR with primers designed on the aligned TR unit sequences and with the genomic DNA of a target species as a template, which enables amplification of the whole pool of monomers inherent in the chromosomes of the target species. We demonstrate the efficiency of the pipeline using FISH probes developed for the A, B, and R subgenome chromosomes of hexaploid triticale (BBAARR), based on a bioinformatics analysis of the whole-genome sequence of the D genome of Aegilops tauschii (DD).
Our pipeline can be used to develop chromosome markers in closely related species for comparative cytogenetics in evolutionary and breeding studies.
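
The qPCR-based ranking step mentioned in the abstract can be illustrated with a minimal sketch. It assumes the common delta-Ct approximation, in which the abundance of a TR relative to a single-copy reference gene is taken as efficiency^(Ct_reference - Ct_TR). The abstract does not state which quantification formula the authors used, so this formula, the names `QpcrResult`, `relative_abundance`, and `rank_repeats`, and the Ct values below are all hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class QpcrResult:
    """One qPCR measurement on target-genome DNA (all fields are illustrative)."""
    repeat_id: str       # hypothetical identifier of a candidate TR
    ct_repeat: float     # mean Ct of the TR amplicon
    ct_reference: float  # mean Ct of a single-copy reference gene, same DNA sample


def relative_abundance(result: QpcrResult, efficiency: float = 2.0) -> float:
    """Estimate TR abundance relative to the reference gene.

    Assumption: delta-Ct model with equal amplification efficiency for both
    amplicons (2.0 means perfect doubling per cycle).
    """
    return efficiency ** (result.ct_reference - result.ct_repeat)


def rank_repeats(results: List[QpcrResult],
                 efficiency: float = 2.0) -> List[Tuple[str, float]]:
    """Return (repeat_id, relative abundance) pairs, most abundant first."""
    scored = [(r.repeat_id, relative_abundance(r, efficiency)) for r in results]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Made-up Ct values for demonstration only; not data from the study.
    demo = [
        QpcrResult("TR_a", ct_repeat=12.0, ct_reference=24.0),
        QpcrResult("TR_b", ct_repeat=20.0, ct_reference=24.0),
        QpcrResult("TR_c", ct_repeat=16.0, ct_reference=24.0),
    ]
    for repeat_id, abundance in rank_repeats(demo):
        print(f"{repeat_id}\t{abundance:.0f}")
```

Under these assumptions, the TRs at the top of the ranking would be the first candidates for conversion into FISH probes, since a lower Ct relative to the reference implies a higher copy number in the target genome.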