Representational Rényi Heterogeneity.

Abraham Nunes, Martin Alda, Timothy Bardouille, Thomas Trappenberg
Published in: Entropy (Basel, Switzerland) (2020)
A discrete system's heterogeneity is measured by the Rényi heterogeneity family of indices (also known as Hill numbers or Hannah-Kay indices), whose units are the numbers equivalent. Unfortunately, numbers equivalent heterogeneity measures for non-categorical data require a priori (A) categorical partitioning and (B) pairwise distance measurement on the observable data space, thereby precluding application to problems with ill-defined categories or where semantically relevant features must be learned as abstractions from some data. We thus introduce representational Rényi heterogeneity (RRH), which transforms an observable domain onto a latent space upon which the Rényi heterogeneity is both tractable and semantically relevant. This method requires neither a priori binning nor definition of a distance function on the observable space. We show that RRH can generalize existing biodiversity and economic equality indices. Compared with existing indices on a beta-mixture distribution, we show that RRH responds more appropriately to changes in mixture component separation and weighting. Finally, we demonstrate the measurement of RRH in a set of natural images, with respect to abstract representations learned by a deep neural network. The RRH approach will further enable heterogeneity measurement in disciplines whose data do not easily conform to the assumptions of existing indices.
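As a minimal illustration of the categorical Rényi heterogeneity (Hill number) that the abstract takes as its starting point, the sketch below computes the index of order q for a discrete distribution using the standard formula (sum of p_i^q) raised to 1/(1-q), with the q = 1 case taken as the exponential of Shannon entropy. The function name `renyi_heterogeneity` and its arguments are hypothetical and chosen for illustration; this is not the authors' code, and it does not implement the latent-space RRH procedure itself.

```python
import math


def renyi_heterogeneity(weights, q):
    """Hill number of order q for a discrete distribution.

    `weights` are non-negative category weights (normalized here to sum to 1).
    The result is in units of "numbers equivalent": the number of equally
    likely categories that would produce the same index value.
    """
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must contain at least one positive value")
    # Drop empty categories; they do not contribute for any order q.
    probs = [w / total for w in weights if w > 0]

    if q == 1:
        # Limit as q -> 1: exponential of Shannon entropy.
        return math.exp(-sum(p * math.log(p) for p in probs))
    if math.isinf(q):
        # Limit as q -> +infinity: reciprocal of the largest probability.
        return 1.0 / max(probs)
    return sum(p ** q for p in probs) ** (1.0 / (1.0 - q))


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    skewed = [0.7, 0.2, 0.1]
    for order in (0, 1, 2):
        print(order, renyi_heterogeneity(uniform, order),
              renyi_heterogeneity(skewed, order))
```

For a uniform distribution over four categories the index is 4 (up to floating-point error) at every order, which is what "numbers equivalent" means. For a skewed distribution the value falls as q grows, because higher orders weight the dominant categories more heavily.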
Keyphrases
  • single cell
  • electronic health record
  • big data
  • neural network
  • machine learning
  • data analysis