Improving Inverse Probability Weighting by Post-calibrating Its Propensity Scores.

Published in: Epidemiology (Cambridge, Mass.) (2024)

Theoretical guarantees for causal inference using propensity scores rest in part on the scores behaving like conditional probabilities. However, scores between zero and one do not necessarily behave like probabilities, especially when output by flexible statistical estimators. We perform a simulation study to assess the error in estimating the average treatment effect before and after applying a simple and well-established postprocessing method to calibrate the propensity scores. We observe that postcalibration reduces the error in effect estimation and that larger improvements in calibration result in larger improvements in effect estimation. Specifically, expressive tree-based estimators, which are often less well calibrated initially than logistic regression-based models, tend to show larger improvements than those models. Given this improvement in effect estimation and the low computational cost of postcalibration, we recommend its adoption when modeling propensity scores with expressive models.
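
The workflow described above (estimate propensity scores, postcalibrate them, then apply inverse probability weighting) can be illustrated with a minimal sketch. The abstract does not name the calibration method, so this example assumes Platt-style logistic recalibration as a stand-in. The simulated data, the deliberately overconfident scores, and all function names (`platt_calibrate`, `ipw_ate`, `log_loss`) are hypothetical and are not taken from the study. Calibration is fit in-sample here for brevity; the study may use a different procedure.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p, eps=1e-6):
    # Clip to (eps, 1 - eps) so the log-odds stay finite.
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))


def platt_calibrate(scores, treated, n_iter=25, tol=1e-10):
    """Assumed calibrator: fit P(T=1 | s) = sigmoid(a * logit(s) + b)
    by Newton's method on the log-loss, then return the calibrated scores."""
    x = [logit(s) for s in scores]
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        ga = gb = haa = hab = hbb = 0.0
        for xi, ti in zip(x, treated):
            p = sigmoid(a * xi + b)
            w = p * (1.0 - p)
            ga += (p - ti) * xi   # gradient w.r.t. a
            gb += p - ti          # gradient w.r.t. b
            haa += w * xi * xi    # Hessian entries
            hab += w * xi
            hbb += w
        det = haa * hbb - hab * hab
        if det <= 1e-12:
            break
        da = (hbb * ga - hab * gb) / det
        db = (haa * gb - hab * ga) / det
        a, b = a - da, b - db
        if abs(da) + abs(db) < tol:
            break
    return [sigmoid(a * xi + b) for xi in x]


def ipw_ate(outcome, treated, scores):
    # Horvitz-Thompson form of the IPW estimator of the average treatment effect.
    n = len(outcome)
    treated_term = sum(t * y / e for y, t, e in zip(outcome, treated, scores))
    control_term = sum((1 - t) * y / (1.0 - e)
                       for y, t, e in zip(outcome, treated, scores))
    return (treated_term - control_term) / n


def log_loss(treated, scores, eps=1e-6):
    # Mean negative log-likelihood; lower means better-calibrated probabilities.
    total = 0.0
    for t, s in zip(treated, scores):
        s = min(max(s, eps), 1.0 - eps)
        total -= t * math.log(s) + (1 - t) * math.log(1.0 - s)
    return total / len(treated)


# Hypothetical simulation: one confounder, known treatment effect.
rng = random.Random(0)
TRUE_ATE = 2.0
treated, outcome, raw_scores = [], [], []
for _ in range(5000):
    x = rng.gauss(0.0, 1.0)
    t = 1 if rng.random() < sigmoid(0.8 * x) else 0  # true propensity
    treated.append(t)
    outcome.append(TRUE_ATE * t + x + rng.gauss(0.0, 1.0))
    # Overconfident scores mimic a flexible model: correct ranking, poor calibration.
    raw_scores.append(sigmoid(2.4 * x))

cal_scores = platt_calibrate(raw_scores, treated)
ate_raw = ipw_ate(outcome, treated, raw_scores)
ate_cal = ipw_ate(outcome, treated, cal_scores)

print("log-loss raw / calibrated:",
      log_loss(treated, raw_scores), log_loss(treated, cal_scores))
print("ATE raw / calibrated / true:", ate_raw, ate_cal, TRUE_ATE)
```

In this toy setup the raw scores rank units correctly but are too extreme, so recalibrating on the log-odds scale can recover scores close to the true propensities. Real propensity models may be miscalibrated in ways that a two-parameter logistic map cannot fix, in which case a nonparametric calibrator such as isotonic regression may be more appropriate.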

Keyphrases

combination therapy