Probabilistic embedding, clustering, and alignment for integrating spatial transcriptomics data with PRECAST.

Wei Liu, Xu Liao, Ziye Luo, Yi Yang, Mai Chan Lau, Yuling Jiao, Xingjie Shi, Weiwei Zhai, Hongkai Ji, Joe Poh Sheng Yeong, Jin Liu
Published in: Nature Communications (2023)
Spatially resolved transcriptomics involves a set of emerging technologies that enable transcriptomic profiling of tissues while retaining the physical location of expression. Although a variety of methods have been developed for data integration, most of them are designed for single-cell RNA-seq datasets and do not consider spatial information. Thus, methods that can integrate spatial transcriptomics data from multiple tissue slides, possibly from multiple individuals, are needed. Here, we present PRECAST, a data integration method for multiple spatial transcriptomics datasets with complex batch effects and/or biological effects between slides. PRECAST unifies spatial factor analysis simultaneously with spatial clustering and embedding alignment, while requiring only partially shared cell/domain clusters across datasets. Using simulated data and four real datasets, we show improved cell/domain detection with outstanding visualization, and the estimated aligned embeddings and cell/domain labels facilitate many downstream analyses. We demonstrate that PRECAST is computationally scalable and applicable to spatial transcriptomics datasets from different platforms.
Keyphrases
  • single cell
  • rna seq
  • high throughput
  • electronic health record
  • big data
  • gene expression
  • stem cells
  • mental health
  • healthcare
  • physical activity
  • machine learning
  • deep learning