Sparse Multicategory Generalized Distance Weighted Discrimination in Ultra-High Dimensions.
Tong Su, Yafei Wang, Yi Liu, William G Branton, Eugene Asahchop, Christopher Power, Bei Jiang, Linglong Kong, Niansheng Tang
Published in: Entropy (Basel, Switzerland) (2020)
Distance weighted discrimination (DWD) is an appealing classification method that is capable of overcoming data piling problems in high-dimensional settings. Variable selection in multicategory classification is especially challenging when various sparsity structures are assumed in these settings. In this paper, we propose a multicategory generalized DWD (MgDWD) method that maintains intrinsic variable group structures during selection using a sparse group lasso penalty. Theoretically, we derive minimizer uniqueness for the penalized MgDWD loss function and consistency properties for the proposed classifier. We further develop an efficient algorithm based on the proximal operator to solve the optimization problem. The performance of MgDWD is evaluated using finite sample simulations and miRNA data from an HIV study.
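To make the role of the proximal operator concrete, the following is a minimal sketch of the standard proximal map for an unweighted sparse group lasso penalty with non-overlapping groups: elementwise soft-thresholding followed by groupwise shrinkage. It is not the authors' algorithm. The function names, the unweighted form of the penalty, and the parameters `step`, `lam1`, and `lam2` are assumptions made for illustration only.

```python
import math


def soft_threshold(x, t):
    """Scalar soft-thresholding: sign(x) * max(|x| - t, 0)."""
    return math.copysign(max(abs(x) - t, 0.0), x)


def prox_sparse_group_lasso(v, groups, step, lam1, lam2):
    """Proximal map of step * (lam1 * ||b||_1 + lam2 * sum_g ||b_g||_2) at v.

    v      : list of floats (current coefficient vector)
    groups : list of index lists that partition range(len(v));
             groups are assumed not to overlap (hypothetical layout)
    step   : step size of the surrounding proximal gradient iteration
    lam1   : weight of the lasso (elementwise) term
    lam2   : weight of the group lasso term, unweighted across groups
    """
    out = [0.0] * len(v)
    for idx in groups:
        # Step 1: elementwise soft-thresholding handles the lasso term.
        u = [soft_threshold(v[i], step * lam1) for i in idx]
        # Step 2: shrink the whole group toward zero; the group is set
        # exactly to zero when its norm falls below step * lam2.
        norm = math.sqrt(sum(x * x for x in u))
        scale = max(1.0 - step * lam2 / norm, 0.0) if norm > 0.0 else 0.0
        for i, x in zip(idx, u):
            out[i] = scale * x
    return out


if __name__ == "__main__":
    # Two hypothetical groups of two coefficients each.
    v = [3.0, -4.0, 0.5, -0.2]
    print(prox_sparse_group_lasso(v, [[0, 1], [2, 3]], 1.0, 1.0, 1.0))
```

In a proximal gradient scheme, this map would be applied after each gradient step on the smooth loss. The two-stage form is what lets a penalty of this kind zero out whole groups while also zeroing individual coefficients inside the groups that survive.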
Keyphrases
- machine learning
- deep learning
- high resolution
- electronic health record
- big data
- magnetic resonance
- antiretroviral therapy
- mental health
- neural network
- hiv infected
- contrast enhanced
- human immunodeficiency virus
- network analysis
- hiv positive
- hiv testing
- hepatitis c virus
- molecular dynamics
- hiv aids
- computed tomography
- men who have sex with men
- monte carlo