Sparse Generative Topographic Mapping for Both Data Visualization and Clustering.

Hiromasa Kaneko
Published in: Journal of Chemical Information and Modeling (2018)
To achieve simultaneous data visualization and clustering, the method of sparse generative topographic mapping (SGTM) is developed by modifying the conventional GTM algorithm. While the weight of each grid point is constant in the original GTM, it becomes a variable in the proposed SGTM, enabling data points to be clustered on two-dimensional maps. The appropriate number of clusters is determined by optimization based on the Bayesian information criterion. Analysis of numerical simulation data sets along with quantitative structure-property relationship and quantitative structure-activity relationship data sets confirmed that the proposed SGTM provides the same degree of visualization performance as the original GTM and clusters data points appropriately. Python and MATLAB codes for the proposed algorithm are available at https://github.com/hkaneko1985/gtm-generativetopographicmapping .
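The core idea in the abstract, letting the weight of each grid point vary instead of holding it constant, can be illustrated with a minimal sketch. This is not the code from the repository above and not the full SGTM algorithm: the grid-point centers are held fixed here (in GTM they come from a mapping of the latent grid and are updated as well), and the names `responsibilities`, `update_weights`, `bic`, `centers`, and `beta` are hypothetical choices made for illustration. The sketch shows only how an EM-style update of the weights drives the weight of an unused grid point toward zero, and how a Bayesian information criterion value could be computed from a log-likelihood.

```python
import math


def responsibilities(data, centers, weights, beta):
    """E-step: posterior probability of each grid point for each data point.

    Assumes isotropic Gaussians with shared precision beta (an assumption
    of this sketch). Returns the responsibilities and the log-likelihood.
    """
    dim = len(data[0])
    norm = (beta / (2.0 * math.pi)) ** (dim / 2.0)
    resp, log_likelihood = [], 0.0
    for x in data:
        dens = []
        for c, w in zip(centers, weights):
            sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
            dens.append(w * norm * math.exp(-0.5 * beta * sq_dist))
        total = sum(dens)
        resp.append([d / total for d in dens])
        log_likelihood += math.log(total)
    return resp, log_likelihood


def update_weights(resp):
    """M-step for the weights only: mean responsibility per grid point."""
    n, k = len(resp), len(resp[0])
    return [sum(r[j] for r in resp) / n for j in range(k)]


def bic(log_likelihood, n_params, n_samples):
    """Bayesian information criterion; lower values are preferred."""
    return -2.0 * log_likelihood + n_params * math.log(n_samples)


# Toy data: two tight groups, with three candidate grid points.
data = [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers = [(0.0, 0.0), (2.5, 2.5), (5.0, 5.0)]  # fixed, hypothetical
weights = [1.0 / 3.0] * 3                       # constant weights to start
beta = 4.0                                      # hypothetical precision

for _ in range(50):
    resp, log_lik = responsibilities(data, centers, weights, beta)
    weights = update_weights(resp)

# Assign each data point to its most responsible grid point.
clusters = [max(range(len(centers)), key=lambda j: r[j]) for r in resp]
```

In this toy setting the middle grid point explains none of the data, so its weight collapses and the points split between the two remaining grid points. How parameters are counted for `bic` (for example, counting only grid points with non-negligible weight) is a modeling decision that this sketch leaves open.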
Keyphrases
  • electronic health record
  • big data
  • high resolution
  • single cell
  • healthcare
  • physical activity
  • structure activity relationship
  • rna seq
  • weight gain
  • body weight
  • high density