Mutual Information as a Performance Measure for Binary Predictors Characterized by Both ROC Curve and PROC Curve Analysis.

Gareth Hughes, Jennifer Kopetzky, Neil McRoberts

Published in: Entropy (Basel, Switzerland) (2020)

The predictive receiver operating characteristic (PROC) curve differs from the better-known receiver operating characteristic (ROC) curve in that it provides a basis for the evaluation of binary diagnostic tests using metrics defined conditionally on the outcome of the test rather than metrics defined conditionally on the actual disease status. Application of PROC curve analysis may be hindered by the complex graphical patterns that are sometimes generated. Here we present an information-theoretic analysis that allows concurrent evaluation of PROC and ROC curves in a simple graphical format. The analysis is based on the observation that mutual information may be viewed both as a function of ROC curve summary statistics (sensitivity and specificity) and prevalence, and as a function of predictive values and prevalence. Mutual information calculated from a 2 × 2 prediction-realization table for a specified risk score threshold on an ROC curve is the same as the mutual information calculated at the same risk score threshold on a corresponding PROC curve. Thus, for a given value of prevalence, the risk score threshold that maximizes mutual information is the same on both the ROC curve and the corresponding PROC curve. Phytopathologists and clinicians who have previously relied solely on ROC curve summary statistics when formulating risk thresholds for application in practical agricultural or clinical decision-making contexts are thus presented with a methodology that brings predictive values within the scope of that formulation.
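
The equivalence stated above can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the prevalence of 0.2, and the four (sensitivity, specificity) operating points are hypothetical values chosen for illustration. The sketch builds the 2 × 2 joint probability table once from sensitivity, specificity, and prevalence (the ROC view) and once from positive and negative predictive values and prevalence (the PROC view), then checks that both give the same mutual information and the same maximizing operating point. It assumes sensitivity + specificity is not equal to 1, so that the predictive values can be inverted.

```python
import math


def mutual_information(joint):
    """Mutual information (in nats) of a 2x2 joint probability table.

    joint[i][j]: i indexes actual status (0 = case, 1 = non-case),
    j indexes the prediction (0 = positive, 1 = negative).
    """
    row = [sum(r) for r in joint]          # marginals of actual status
    col = [sum(c) for c in zip(*joint)]    # marginals of the prediction
    mi = 0.0
    for i in range(2):
        for j in range(2):
            p = joint[i][j]
            if p > 0:                      # 0 * log(0) is taken as 0
                mi += p * math.log(p / (row[i] * col[j]))
    return mi


def joint_from_roc(se, sp, prev):
    """Joint table from sensitivity, specificity, and prevalence."""
    return [[prev * se, prev * (1 - se)],
            [(1 - prev) * (1 - sp), (1 - prev) * sp]]


def predictive_values(se, sp, prev):
    """Positive and negative predictive values via Bayes' rule."""
    tp, fn = prev * se, prev * (1 - se)
    fp, tn = (1 - prev) * (1 - sp), (1 - prev) * sp
    return tp / (tp + fp), tn / (tn + fn)


def joint_from_proc(ppv, npv, prev):
    """Joint table from predictive values and prevalence.

    tau is the probability of a positive prediction, recovered from
    prev = ppv * tau + (1 - npv) * (1 - tau).
    """
    tau = (prev + npv - 1) / (ppv + npv - 1)
    return [[tau * ppv, (1 - tau) * (1 - npv)],
            [tau * (1 - ppv), (1 - tau) * npv]]


# Hypothetical prevalence and operating points (one per risk score threshold).
prev = 0.2
points = [(0.95, 0.50), (0.85, 0.75), (0.70, 0.90), (0.50, 0.97)]

mi_roc = [mutual_information(joint_from_roc(se, sp, prev)) for se, sp in points]
mi_proc = [
    mutual_information(joint_from_proc(*predictive_values(se, sp, prev), prev))
    for se, sp in points
]

best_roc = max(range(len(points)), key=lambda k: mi_roc[k])
best_proc = max(range(len(points)), key=lambda k: mi_proc[k])

for (se, sp), a, b in zip(points, mi_roc, mi_proc):
    print(f"Se={se:.2f} Sp={sp:.2f}  MI(ROC)={a:.6f}  MI(PROC)={b:.6f}")
print("Maximizing point index (ROC, PROC):", best_roc, best_proc)
```

Both views reconstruct the same joint table, so the two mutual information columns agree up to floating-point rounding and the same operating point maximizes both. Changing `prev` can move the maximizing point, which is why the abstract states the result for a given value of prevalence.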
