Post model-fitting exploration via a "Next-Door" analysis.
Leying Guan, Robert Tibshirani
Published in: The Canadian Journal of Statistics / Revue canadienne de statistique (2020)
We propose a simple method for evaluating the model that has been chosen by an adaptive regression procedure, our main focus being the lasso. The procedure deletes each chosen predictor in turn and refits the lasso to obtain a set of models that are "close" to the chosen "base model," then compares the error rate of the base model with those of the nearby models. If the deletion of a predictor leads to a significant deterioration in the model's predictive power, the predictor is called indispensable; otherwise, the nearby model is called acceptable and can serve as a good alternative to the base model. This provides both an assessment of the predictive contribution of each variable and a set of alternative models that may be used in place of the chosen model. We call this procedure "Next-Door analysis" since it examines models "next" to the base model. It can be applied to supervised learning problems with ℓ1 penalization and to stepwise procedures. We have implemented it in the R language as a library to accompany the well-known glmnet library.
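As a rough illustration of the delete-and-refit idea described above, the following R sketch fits a base lasso with cv.glmnet, removes each selected predictor in turn via glmnet's exclude argument, and compares cross-validated errors. This is not the authors' accompanying library; the simulated data, the variable names, and the use of cross-validated error as the comparison criterion are simplifications for illustration only.

# Minimal sketch of the delete-one-predictor-and-refit idea (illustrative,
# not the authors' package). Assumes a numeric predictor matrix x and
# response y; simulated here for a self-contained example.
library(glmnet)

set.seed(1)
n <- 100; p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- x[, 1] - 2 * x[, 2] + rnorm(n)

# Fit the base lasso model, choosing lambda by cross-validation
base_cv  <- cv.glmnet(x, y)
base_err <- min(base_cv$cvm)
# Indices of the predictors chosen in the base model (intercept dropped)
active   <- which(coef(base_cv, s = "lambda.min")[-1] != 0)

# For each chosen predictor, exclude it, refit the lasso, and record the
# cross-validated error of the resulting "nearby" model
nearby_err <- sapply(active, function(j) {
  cv_j <- cv.glmnet(x, y, exclude = j)
  min(cv_j$cvm)
})

# Predictors whose deletion clearly inflates the error are candidates for
# "indispensable"; the others index acceptable alternative models
data.frame(predictor = active, base_err = base_err, nearby_err = nearby_err)

In this simulation, excluding predictor 1 or 2 should visibly increase the cross-validated error, while excluding a noise variable that happened to enter the model should not; the paper's actual procedure formalizes this comparison with a significance assessment rather than a raw error difference.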