The spike-and-slab lasso and scalable algorithm to accommodate multinomial outcomes in variable selection problems.
Justin M Leach, Nengjun Yi, Inmaculada B Aban, The Alzheimer's Disease Neuroimaging Initiative
Published in: Journal of applied statistics (2023)
Spike-and-slab prior distributions are used to impose variable selection in Bayesian regression-style problems with many possible predictors. These priors are a mixture of two zero-centered distributions with differing variances, resulting in different shrinkage levels on parameter estimates based on whether they are relevant to the outcome. The spike-and-slab lasso assigns mixtures of double exponential distributions as priors for the parameters. This framework was initially developed for linear models, later extended to generalized linear models, and shown to perform well in scenarios requiring sparse solutions. Standard formulations of generalized linear models cannot immediately accommodate categorical outcomes with more than two categories, i.e., multinomial outcomes, and require modifications to model specification and parameter estimation. Such modifications are relatively straightforward in a classical setting but require additional theoretical and computational considerations in Bayesian settings, which can depend on the choice of prior distributions for the parameters of interest. While previous developments of the spike-and-slab lasso focused on continuous, count, and/or binary outcomes, we generalize the spike-and-slab lasso to accommodate multinomial outcomes, developing both the theoretical basis for the model and an expectation-maximization algorithm to fit the model. To our knowledge, this is the first generalization of the spike-and-slab lasso to allow for multinomial outcomes.
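
To make the mixture prior described above concrete, the following is a minimal Python sketch of a two-component double exponential mixture for a single coefficient. It is an illustration under stated assumptions, not the authors' implementation: the function names and the symbols `theta` (prior probability of the slab component), `s0` (spike scale), and `s1` (slab scale) are hypothetical and do not appear in the abstract. The last function computes the conditional probability that a coefficient belongs to the slab, which is the kind of quantity an expectation-maximization algorithm would typically update in its E-step; the multinomial extension itself is not shown.

```python
import math


def double_exponential_pdf(beta, scale):
    """Density of a zero-centered double exponential (Laplace) distribution."""
    return math.exp(-abs(beta) / scale) / (2.0 * scale)


def spike_slab_lasso_pdf(beta, theta, s0, s1):
    """Mixture prior density: narrow spike (scale s0) plus wide slab (scale s1).

    theta, s0, and s1 are illustrative names (assumptions), with s0 < s1.
    """
    spike = (1.0 - theta) * double_exponential_pdf(beta, s0)
    slab = theta * double_exponential_pdf(beta, s1)
    return spike + slab


def slab_probability(beta, theta, s0, s1):
    """Conditional probability that beta was drawn from the slab component."""
    slab = theta * double_exponential_pdf(beta, s1)
    spike = (1.0 - theta) * double_exponential_pdf(beta, s0)
    return slab / (slab + spike)


if __name__ == "__main__":
    # A coefficient near zero is attributed mostly to the spike (strong
    # shrinkage); a large coefficient is attributed mostly to the slab.
    for beta in (0.0, 0.1, 0.5, 2.0):
        print(beta, slab_probability(beta, theta=0.5, s0=0.05, s1=1.0))
```

With these illustrative settings, the slab probability is small for coefficients near zero and approaches one as the coefficient grows, which is how the mixture applies different shrinkage levels to irrelevant and relevant predictors.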