A new goodness-of-fit measure for probit models: Surrogate R²
Dungang Liu, Xiaorui Zhu, Brandon Greenwell, Zewei Lin
Published in: The British Journal of Mathematical and Statistical Psychology (2022)
Probit models are used extensively for inference in the social sciences, where discrete data are prevalent. Among the many accompanying inference problems, a critical question remains unsettled: how to develop a goodness-of-fit measure that resembles the ordinary least squares (OLS) R² used for linear models. Such a measure has long been sought to achieve 'comparability' of different empirical models across multiple samples addressing similar social questions. To this end, we propose a novel R² measure for probit models using the notion of surrogacy, that is, simulating a continuous variable S as a surrogate of the original discrete response (Liu & Zhang, 2018, Journal of the American Statistical Association, 113, 845). The proposed R² is the proportion of the variance of the surrogate response explained by the explanatory variables through a linear model, and we call it a surrogate R². This paper shows both theoretically and numerically that the surrogate R² approximates the OLS R² based on the latent continuous variable, preserves the interpretation of explained variation, and maintains monotonicity between nested models. As no other pseudo R², McKelvey and Zavoina's and McFadden's included, meets all three criteria simultaneously, our measure fills this crucial void in probit model inference.
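The construction described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' implementation; it is one plausible reading of the abstract under simplifying assumptions: a binary response, a single covariate, a probit fit by Fisher scoring, and a surrogate S drawn from N(a + b·x, 1) truncated to the side of zero implied by the observed response. All function names (`fit_probit`, `draw_surrogate`, `r_squared`, `surrogate_r2`) and the simulated data are hypothetical. Only the Python standard library is used.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()  # standard normal: cdf, pdf, inv_cdf


def fit_probit(x, y, iters=50, tol=1e-10):
    """Fisher scoring for P(y = 1 | x) = Phi(a + b * x), one covariate (assumption)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = i00 = i01 = i11 = 0.0  # score vector and expected information
        for xi, yi in zip(x, y):
            eta = a + b * xi
            p = min(max(STD_NORMAL.cdf(eta), 1e-10), 1 - 1e-10)
            d = STD_NORMAL.pdf(eta)
            r = (yi - p) * d / (p * (1 - p))
            w = d * d / (p * (1 - p))
            g0 += r
            g1 += r * xi
            i00 += w
            i01 += w * xi
            i11 += w * xi * xi
        det = i00 * i11 - i01 * i01
        da = (i11 * g0 - i01 * g1) / det  # solve the 2x2 system I * delta = g
        db = (i00 * g1 - i01 * g0) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < tol:
            break
    return a, b


def draw_surrogate(eta, yi, rng):
    """Draw S ~ N(eta, 1) truncated to (0, inf) if yi == 1, else to (-inf, 0]."""
    p0 = STD_NORMAL.cdf(-eta)  # P(S <= 0) under N(eta, 1)
    lo, hi = (p0, 1.0) if yi == 1 else (0.0, p0)
    u = lo + (hi - lo) * rng.random()
    u = min(max(u, 1e-12), 1 - 1e-12)  # keep inv_cdf inside its open domain
    return eta + STD_NORMAL.inv_cdf(u)


def r_squared(x, s):
    """R-squared of the OLS regression of s on x (squared correlation)."""
    mx, ms = fmean(x), fmean(s)
    sxy = sum((xi - mx) * (si - ms) for xi, si in zip(x, s))
    sxx = sum((xi - mx) ** 2 for xi in x)
    sss = sum((si - ms) ** 2 for si in s)
    return sxy * sxy / (sxx * sss)


def surrogate_r2(x, y, seed=0):
    """Fit the probit model, simulate the surrogate, and regress it on x."""
    rng = random.Random(seed)
    a, b = fit_probit(x, y)
    s = [draw_surrogate(a + b * xi, yi, rng) for xi, yi in zip(x, y)]
    return r_squared(x, s)


# Simulated data (hypothetical): latent z = 0.5 + x + noise, observed y = 1[z > 0].
data_rng = random.Random(42)
x = [data_rng.gauss(0.0, 1.0) for _ in range(2000)]
z = [0.5 + xi + data_rng.gauss(0.0, 1.0) for xi in x]
y = [1 if zi > 0 else 0 for zi in z]

latent_r2 = r_squared(x, z)  # OLS R-squared on the (normally unobserved) latent variable
sur_r2 = surrogate_r2(x, y)  # surrogate R-squared computed from the binary response only
print(f"latent R2 = {latent_r2:.3f}, surrogate R2 = {sur_r2:.3f}")
```

In this simulation the latent variable is known, so the two quantities can be compared directly; with real data only the surrogate version is computable. Because S is simulated, the value changes with the seed, and averaging over several draws is one way to reduce that noise.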