VAR(1) based models do not always outpredict AR(1) models in typical psychological applications.

Kirsten Bulteel, Merijn Mestdagh, Francis Tuerlinckx, Eva Ceulemans

Published in: Psychological Methods (2018)

In psychology, modeling multivariate dynamical processes within a person is gaining ground. A popular model is the lag-one vector autoregressive or VAR(1) model and its variants, in which each variable is regressed on all variables (including itself) at the previous time point. Many parameters have to be estimated in the VAR(1) model, however. The question thus arises whether the VAR(1) model is too complex and overfits the data. If so, the estimated model will not properly predict new, unseen data. As a consequence, the estimated parameters cannot be trusted to adequately characterize the individual from whom the data at hand were sampled. In this article, we evaluate for current psychological applications whether the VAR(1) model outpredicts simpler models, using cross-validation (CV) techniques to determine the predictive accuracy. As it is unclear whether one should use standard CV techniques (leave-one-out CV or K-fold CV) or variants that take time dependence into account (blocked CV, hv-block CV, or accumulated prediction errors), we first compare the relative performance of these five CV techniques in a simulation study. The simulation settings mimic the data characteristics of current psychological VAR(1) applications, and the results show that blocked CV has the best performance in general. Subsequently, we use blocked CV to assess to what extent the VAR(1) models predict unseen data for three recent psychological applications. We show that the VAR(1) based models do not outperform the AR(1) based ones for the three presented psychological applications.
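The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names (`solve`, `ols`, `blocked_cv_mse`), the simulated data, the number of blocks, and the use of plain least squares are all assumptions made for illustration. Each variable is regressed on lagged predictors (only its own lag for AR(1), all lags for VAR(1)), and contiguous blocks of time points are held out in turn. No buffer is left between training and test blocks, and the article's exact fold construction is not reproduced here.

```python
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def ols(X, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def blocked_cv_mse(series, predictors, n_blocks=5):
    """Mean squared one-step prediction error under blocked CV.

    series: list of T observations, each a list of variable values.
    predictors(j): indices of the lagged variables used to predict variable j.
    """
    n_vars = len(series[0])
    targets = list(range(1, len(series)))  # t is predicted from t - 1
    size = len(targets) // n_blocks
    sq_err, count = 0.0, 0
    for b in range(n_blocks):
        lo = b * size
        hi = (b + 1) * size if b < n_blocks - 1 else len(targets)
        test, train = targets[lo:hi], targets[:lo] + targets[hi:]
        for j in range(n_vars):
            cols = predictors(j)

            def row(t):
                # Intercept plus the selected variables at the previous time point.
                return [1.0] + [series[t - 1][k] for k in cols]

            beta = ols([row(t) for t in train], [series[t][j] for t in train])
            for t in test:
                pred = sum(bk * xk for bk, xk in zip(beta, row(t)))
                sq_err += (series[t][j] - pred) ** 2
                count += 1
    return sq_err / count


# Hypothetical data: three independent AR(1) processes, 80 time points.
random.seed(1)
phi = [0.5, 0.3, 0.4]  # assumed autoregressive coefficients
series = [[0.0, 0.0, 0.0]]
for _ in range(80):
    prev = series[-1]
    series.append([phi[k] * prev[k] + random.gauss(0, 1) for k in range(3)])

mse_ar = blocked_cv_mse(series, lambda j: [j])         # AR(1): own lag only
mse_var = blocked_cv_mse(series, lambda j: [0, 1, 2])  # VAR(1): all lags
print("AR(1) CV error:", mse_ar, "VAR(1) CV error:", mse_var)
```

With three variables, the VAR(1) fit estimates four coefficients per equation where the AR(1) fit estimates two. Whether those extra cross-lagged coefficients improve or hurt out-of-sample prediction is what the cross-validated error is meant to reveal.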
