Sparse Extended Redundancy Analysis: Variable Selection via the Exclusive LASSO.

Bing Cai Kok, Ji Sok Choi, Hyelim Oh, Ji Yeh Choi
Published in: Multivariate behavioral research (2019)
Extended Redundancy Analysis is a statistical tool for exploring the directional relationships of multiple sets of exogenous variables on a set of endogenous variables. This approach posits that the endogenous and exogenous variables are related via latent components, each of which is extracted from a set of exogenous variables, that account for the maximum variation of the endogenous variables. However, it is often difficult to distinguish between the true variables that form the latent components and the false variables that do not, especially when the association between the true variables and the exogenous set is weak. To overcome this limitation, we propose a Sparse Extended Redundancy Analysis via the Exclusive LASSO that performs variable selection while maintaining model specification. We validate the performance of the proposed approach in a simulation study. Finally, the empirical utility of this approach is demonstrated through two examples: one on a study of youth academic achievement and the other on a text analysis of newspaper data.
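As a rough illustration of the penalty named in the abstract, the sketch below computes the exclusive LASSO penalty in its commonly stated form: the sum, over groups, of the squared L1 norm of each group's weights. The abstract does not give the authors' formulation or estimation procedure, so the function name, the grouping of weights by exogenous variable set, and the example values are all assumptions made for illustration.

```python
def exclusive_lasso_penalty(weights, groups):
    """Sum over groups of the squared L1 norm of that group's weights.

    weights: flat list of component weights (hypothetical values).
    groups:  list of index lists, one per set of exogenous variables.
    """
    penalty = 0.0
    for idx in groups:
        l1_norm = sum(abs(weights[j]) for j in idx)
        penalty += l1_norm ** 2
    return penalty


# Two hypothetical sets of exogenous variables (3 and 2 variables).
groups = [[0, 1, 2], [3, 4]]

# All weight concentrated in the first set.
concentrated = [2.0, 0.0, 0.0, 0.0, 0.0]
# The same total absolute weight split across both sets.
spread = [1.0, 0.0, 0.0, 1.0, 0.0]

print(exclusive_lasso_penalty(concentrated, groups))  # 4.0
print(exclusive_lasso_penalty(spread, groups))        # 2.0
```

With the same total absolute weight, the penalty is smaller when the weight is spread across sets than when one set is emptied entirely. This is the usual intuition for why an exclusive LASSO can zero out variables within a set while still keeping each set represented, which is one plausible reading of "maintaining model specification" above.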
Keyphrases
  • mental health
  • young adults
  • machine learning
  • deep learning