The Kendall interaction filter for variable interaction screening in high dimensional classification problems.

Youssef Anzarmou, Abdallah Mkhadri, Karim Oualkacha
Published in: Journal of applied statistics (2022)
Accounting for important interaction effects can improve the prediction of many statistical learning models. Identifying relevant interactions, however, is challenging because the number of candidate interactions is ultrahigh-dimensional. Interaction screening strategies can alleviate this problem. However, because interaction effects have heavier-tailed distributions and complex dependence structures, innovative robust and/or model-free screening methods are required to scale the analysis of complex, high-throughput data. In this work, we develop a new model-free interaction screening method, termed the Kendall Interaction Filter (KIF), for classification in high-dimensional settings. KIF proposes a weighted-sum measure, which compares the overall Kendall's τ of a pair of predictors to its within-cluster Kendall's τ, to select interacting pairs of features. The proposed KIF measure captures interactions relevant to the cluster response variable, handles continuous, categorical, or mixed continuous-categorical features, and is invariant under monotonic transformations. The KIF measure enjoys the sure screening property in the high-dimensional setting under mild conditions, without imposing sub-exponential moment assumptions on the features' distribution. We illustrate the favorable behavior of the proposed methodology compared with methods in the same category using simulation studies, and we conduct real data analyses to demonstrate its utility.
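To make the idea of comparing overall and within-cluster Kendall's τ concrete, here is a minimal sketch in Python. The abstract does not give the exact KIF formula, so the details below are assumptions made for illustration only: the weights are the class proportions, the comparison is an absolute difference, and τ is computed as tau-b so that tied (for example categorical) values are handled. The function names `kendall_tau_b` and `kif_like_score` are hypothetical and are not taken from the paper.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b by direct O(n^2) pair counting (handles ties)."""
    concordant = discordant = ties_x = ties_y = 0
    for i, j in combinations(range(len(x)), 2):
        dx = x[i] - x[j]
        dy = y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both denominator terms
        elif dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom > 0 else 0.0


def kif_like_score(xj, xk, labels):
    """Illustrative KIF-style score for one pair of predictors.

    ASSUMPTION: weighted sum over classes of |within-class tau - overall tau|,
    with weights equal to class proportions. The paper's exact measure may differ.
    """
    overall = kendall_tau_b(xj, xk)
    n = len(labels)
    score = 0.0
    for c in set(labels):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        within = kendall_tau_b([xj[i] for i in idx], [xk[i] for i in idx])
        score += (len(idx) / n) * abs(within - overall)
    return score


# Toy data: the pair is positively associated in class 0 and
# negatively associated in class 1, so the association depends on the class.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
xj = [1, 2, 3, 4, 1, 2, 3, 4]
xk_interacting = [1, 2, 3, 4, 4, 3, 2, 1]
xk_plain = [1, 2, 3, 4, 1, 2, 3, 4]

print(kif_like_score(xj, xk_interacting, labels))
print(kif_like_score(xj, xk_plain, labels))
```

In this toy example the first pair receives a large score because its within-class τ differs from the overall τ in both classes, while the second pair scores zero because the association is the same everywhere. Since the score depends on the data only through the ordering of values, applying a strictly increasing transformation to either feature leaves it unchanged, which mirrors the invariance property stated in the abstract. In a screening setting one would compute such a score for every pair of features and keep the top-ranked pairs; the cutoff is a tuning choice that this sketch does not address.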
Keyphrases
  • high throughput
  • machine learning
  • deep learning
  • magnetic resonance
  • electronic health record
  • big data