Analysis of an incomplete binary outcome dichotomized from an underlying continuous variable in clinical trials.

Published in: Pharmaceutical statistics (2022)

In many clinical trials, the outcomes of interest are binary. Often a binary outcome is obtained by dichotomizing a continuous outcome at a threshold of clinical interest. Common approaches to analyzing such data when some values are missing include (a) fitting a generalized linear mixed model (GLMM) to the dichotomized longitudinal binary outcome; and (b) a multiple imputation (MI) based method: imputing missing values in the continuous outcome, dichotomizing it into a binary outcome, and then fitting a generalized linear model to the "complete" data. We conducted comprehensive simulation studies to compare the performance of the GLMM and the MI-based method for estimating the risk difference and the log odds ratio between two treatment arms at the end of the study. In those simulation studies, we considered a range of multivariate distributions for the continuous outcome (a multivariate normal distribution, a multivariate t-distribution, a multivariate log-normal distribution, and the empirical distribution from real clinical trial data) to evaluate the robustness of the estimators to various data-generating models. The simulation results demonstrate that both methods work well under all of the distributions considered, but the MI-based method is more efficient, with smaller mean squared errors than the GLMM. We further applied both the GLMM and the MI-based method to 29 phase 3 diabetes clinical trials and found that the MI-based method generally led to smaller variance estimates than the GLMM.
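The three steps of the MI-based method (impute the continuous outcome, dichotomize, then estimate the treatment effect) can be illustrated with a minimal sketch. This is not the authors' implementation. Everything specific in it is an assumption made for illustration: the simulated two-arm data, the single baseline covariate, the responder threshold of 7.0, the missingness rate, and the function names. Imputation uses a bootstrap-based linear regression per arm as a simple stand-in for a full imputation model, and the final step uses a plain difference in responder proportions in place of a fitted generalized linear model. Results are pooled across imputations with Rubin's rules.

```python
import math
import random
import statistics


def simulate_trial(n_per_arm=200, effect=-0.5, miss_rate=0.2, seed=0):
    """Hypothetical data: rows of (arm, baseline x, final y or None if missing)."""
    rng = random.Random(seed)
    rows = []
    for arm in (0, 1):
        for _ in range(n_per_arm):
            x = rng.gauss(8.0, 1.0)
            y = 2.4 + 0.6 * x + effect * arm + rng.gauss(0.0, 0.8)
            observed = rng.random() > miss_rate  # missing completely at random
            rows.append((arm, x, y if observed else None))
    return rows


def fit_line(pairs):
    """Least-squares fit of y on x; returns (intercept, slope, residual SD)."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in pairs) / sxx
    intercept = my - slope * mx
    rss = sum((y - intercept - slope * x) ** 2 for x, y in pairs)
    return intercept, slope, math.sqrt(rss / (len(pairs) - 2))


def impute_once(rows, rng):
    """Step 1: fill in missing continuous outcomes, separately within each arm."""
    completed = []
    for arm in (0, 1):
        observed = [(x, y) for a, x, y in rows if a == arm and y is not None]
        # Refit on a bootstrap resample so parameter uncertainty is propagated.
        resample = [rng.choice(observed) for _ in observed]
        b0, b1, sd = fit_line(resample)
        for a, x, y in rows:
            if a != arm:
                continue
            if y is None:
                y = b0 + b1 * x + rng.gauss(0.0, sd)
            completed.append((a, y))
    return completed


def risk_difference(completed, threshold):
    """Steps 2 and 3: dichotomize at the threshold, then compare proportions."""
    estimate, variance = 0.0, 0.0
    for arm, sign in ((1, 1.0), (0, -1.0)):
        responders = [y < threshold for a, y in completed if a == arm]
        p = sum(responders) / len(responders)
        estimate += sign * p
        variance += p * (1.0 - p) / len(responders)
    return estimate, variance


def mi_risk_difference(rows, threshold=7.0, n_imputations=20, seed=1):
    """Pool the per-imputation estimates with Rubin's rules."""
    rng = random.Random(seed)
    results = [
        risk_difference(impute_once(rows, rng), threshold)
        for _ in range(n_imputations)
    ]
    estimates = [e for e, _ in results]
    within = statistics.fmean([v for _, v in results])
    between = statistics.variance(estimates)
    total = within + (1.0 + 1.0 / n_imputations) * between
    return statistics.fmean(estimates), math.sqrt(total)


if __name__ == "__main__":
    rd, se = mi_risk_difference(simulate_trial())
    print(f"risk difference = {rd:.3f}, standard error = {se:.3f}")
```

The key design point is that imputation happens on the continuous scale, before dichotomization, so the imputed values retain information about how far each subject is from the threshold. A real analysis would typically impute from a longitudinal model using all intermediate visits and fit a covariate-adjusted generalized linear model in the final step.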
