Visualizing structure and transitions in high-dimensional biological data.

Kevin R Moon, David van Dijk, Zheng Wang, Scott A Gigante, Daniel B Burkhardt, William S Chen, Kristina Yim, Antonia van den Elzen, Matthew J Hirn, Ronald R Coifman, Natalia B Ivanova, Guy Wolf, Smita Krishnaswamy
Published in: Nature Biotechnology (2019)
The high-dimensional data created by high-throughput technologies require visualization tools that reveal data structure and patterns in an intuitive form. We present PHATE, a visualization method that captures both local and global nonlinear structure using an information-geometric distance between data points. We compare PHATE to other tools on a variety of artificial and biological datasets, and find that it consistently preserves a range of patterns in data, including continual progressions, branches and clusters, better than other tools. We define a manifold preservation metric, which we call denoised embedding manifold preservation (DEMaP), and show that PHATE produces lower-dimensional embeddings that are quantitatively better denoised as compared to existing visualization methods. An analysis of a newly generated single-cell RNA sequencing dataset on human germ-layer differentiation demonstrates how PHATE reveals unique biological insight into the main developmental branches, including identification of three previously undescribed subpopulations. We also show that PHATE is applicable to a wide variety of data types, including mass cytometry, single-cell RNA sequencing, Hi-C and gut microbiome data.
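The abstract does not spell out how the information-geometric distance is computed. As a rough illustration only, the sketch below follows one common diffusion-based reading of the idea: build affinities between points, normalize them into a random walk, take several steps, log-transform the resulting transition probabilities, and compare points by the distance between their transformed rows. The function name `potential_distances`, the fixed-bandwidth Gaussian kernel, and the parameters `sigma`, `t` and `eps` are assumptions made for this example, not the authors' implementation, and the final step of embedding the distances in two or three dimensions is omitted.

```python
import numpy as np


def potential_distances(X, sigma=1.0, t=5, eps=1e-7):
    """Illustrative diffusion-based distance between the rows of X.

    Hypothetical helper: the kernel, sigma, t and eps are assumptions
    chosen for this sketch, not values taken from the paper.
    """
    X = np.asarray(X, dtype=float)

    # Pairwise Euclidean distances between data points.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Gaussian affinities (assumed fixed bandwidth sigma).
    K = np.exp(-(dists ** 2) / (2.0 * sigma ** 2))

    # Row-normalize so each row is a probability distribution
    # (a one-step random walk over the data).
    P = K / K.sum(axis=1, keepdims=True)

    # t-step random walk: captures structure beyond immediate neighbours.
    Pt = np.linalg.matrix_power(P, t)

    # Log-transform the transition probabilities; eps avoids log(0).
    U = -np.log(Pt + eps)

    # Distance between points = Euclidean distance between their
    # log-transformed t-step transition profiles.
    return np.linalg.norm(U[:, None, :] - U[None, :, :], axis=-1)


# Two tight pairs of points that lie far apart from each other.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
D = potential_distances(X)
```

In this toy input, points within the same pair have nearly identical transition profiles, so their distance is small, while points in different pairs end up far apart. A low-dimensional visualization would then be obtained by passing such a distance matrix to a multidimensional scaling routine.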
Keyphrases
  • single cell
  • electronic health record
  • high throughput
  • rna seq
  • big data
  • healthcare
  • endothelial cells
  • data analysis
  • gene expression
  • artificial intelligence
  • single molecule
  • induced pluripotent stem cells