A quantitative computational framework for allopolyploid single-cell data integration and core gene ranking in development.
Meiyue Wang, Zijuan Li, Haoyu Wang, Kande Lin, Shu-Song Zheng, Yilong Feng, Wan Teng, Yiping Tong, Wenli Zhang, Chenghong Liu, Hong-Qing Ling, Yue-Qing Hu, Yi-Jing Zhang
Published in: Molecular biology and evolution (2024)
Polyploidization drives regulatory and phenotypic innovation. How the merger of different genomes contributes to polyploid development is a fundamental issue in evolutionary developmental biology and breeding research. Clarifying this issue is challenging because of genome complexity and the difficulty of tracking stochastic subgenome divergence during development. Recent single-cell sequencing techniques have enabled probing of subgenome-divergent regulation in the context of cellular differentiation. However, single-cell data analysis suffers from high error rates due to high dimensionality, noise, and sparsity, and these errors compound in polyploid analysis because comparing the subgenomes within each cell adds further dimensions, hindering deeper mechanistic understanding. Here, we developed a quantitative computational framework, pseudo-genome divergence quantification (pgDQ), for quantifying and tracking subgenome divergence directly at the cellular level. Further comparison with cellular differentiation trajectories derived from scRNA-seq data allowed for an examination of the relationship between subgenome divergence and the progression of development. pgDQ produces robust results and is insensitive to data dropout and noise, avoiding the high error rates that arise from multiple comparisons of genes, cells, and subgenomes. A statistical diagnostic approach is proposed to identify genes that are central to subgenome divergence during development, which facilitates the integration of different data modalities, enabling the identification of factors and pathways that mediate subgenome-divergent activity during development. Case studies demonstrated that applying pgDQ to single-cell and bulk-tissue transcriptome data promotes a systematic and deeper understanding of how dynamic subgenome divergence contributes to developmental trajectories in polyploid evolution.