Modal clustering of matrix-variate data.

Federico Ferraccioli, Giovanna Menardi
Published in: Advances in Data Analysis and Classification (2022)
The nonparametric formulation of density-based clustering, known as modal clustering, draws a correspondence between groups and the attraction domains of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet not trivial, generalization of the approach to the matrix-valued setting, increasingly widespread, for example, in longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distributions based on kernel methods, and analyze their asymptotic properties. Additionally, we propose a generalization of the mean-shift procedure for the identification of the modes of the estimated density. Given the intrinsic high dimensionality of matrix-variate data, we discuss some locally adaptive solutions to handle the problem. We test the procedure via extensive simulations, also with respect to some competitors, and illustrate its performance through two high-dimensional real data applications.
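As a rough illustration of the mode-seeking idea described above, the following minimal sketch runs a mean-shift iteration directly on matrix-valued observations. The abstract does not specify the form of the estimator, so the sketch assumes a Gaussian kernel on the Frobenius distance with a single scalar bandwidth `h`. That choice is equivalent to ordinary mean shift on the vectorized matrices and should not be read as the authors' matrix-variate or locally adaptive estimator. The function name `mean_shift_matrix` and all parameters are hypothetical.

```python
import numpy as np


def mean_shift_matrix(X, start, h, n_iter=200, tol=1e-8):
    """Follow the mean-shift path from `start` towards a mode of a kernel density.

    X     : array of shape (n, p, q), one p x q matrix per observation
    start : array of shape (p, q), starting point of the iteration
    h     : scalar bandwidth (assumption: one bandwidth shared by all entries)
    """
    y = np.asarray(start, dtype=float).copy()
    for _ in range(n_iter):
        # Squared Frobenius distance between the current point and each observation.
        d2 = np.sum((X - y) ** 2, axis=(1, 2))
        # Gaussian kernel weights; subtracting the minimum avoids underflow
        # and cancels out in the weighted mean below.
        w = np.exp(-0.5 * (d2 - d2.min()) / h**2)
        # Mean-shift update: kernel-weighted average of the observed matrices.
        y_new = np.tensordot(w, X, axes=1) / w.sum()
        if np.linalg.norm(y_new - y) < tol:
            return y_new
        y = y_new
    return y
```

Running the iteration from every observation and grouping observations whose paths end at the same point gives a partition into attraction domains, which is the correspondence between groups and modes that modal clustering relies on.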
Keyphrases
  • electronic health record
  • big data
  • minimally invasive
  • data analysis
  • drug delivery
  • machine learning
  • molecular dynamics
  • cross sectional
  • monte carlo