Hazard ratio estimation and inference in clinical trials with many tied event times.

Published in: Statistics in Medicine (2018)

The medical literature contains numerous examples of randomized clinical trials with time-to-event endpoints in which large numbers of events accrued over relatively short follow-up periods, resulting in many tied event times. A common feature of these examples is that the logrank test was used for hypothesis testing and the Cox proportional hazards model was used for hazard ratio estimation. We caution that this practice is particularly risky in the setting of many tied event times, for two reasons. First, the estimator of the hazard ratio can be severely biased if the Breslow tie-handling approximation for the Cox model (the default in SAS and Stata software) is used. Second, the 95% confidence interval for the hazard ratio can include one even when the corresponding logrank test p-value is less than 0.05. To help establish a better practice, applicable to both superiority and noninferiority trials, we use theory and simulations to contrast Wald and score tests based on well-known tie-handling approximations for the Cox model. Our recommendation is to report the Wald test p-value and the corresponding confidence interval based on the Efron approximation. The recommended test is essentially as powerful as the logrank test; the accompanying point and interval estimates of the hazard ratio have excellent statistical properties even in settings with many tied event times; inferential alignment between the p-value and confidence interval is guaranteed; and implementation is straightforward in widely used software.
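To make the contrast between the two tie-handling approximations concrete, the following is a minimal sketch, not the authors' code, of a Cox model with a single covariate fitted by Newton-Raphson under either the Breslow or the Efron approximation, followed by a Wald test and confidence interval for the hazard ratio. The function names (`fit_cox`, `wald_summary`) and the toy data are hypothetical and chosen only for illustration; the sketch assumes one covariate and a finite maximum partial likelihood estimate.

```python
import math
from statistics import NormalDist


def _score_and_info(beta, times, events, x, ties):
    """Score and observed information of the log partial likelihood at beta."""
    score, info = 0.0, 0.0
    for t in sorted({ti for ti, ei in zip(times, events) if ei}):
        s_r = a_r = b_r = 0.0  # weighted sums over the risk set at time t
        s_d = a_d = b_d = 0.0  # same sums over subjects with an event at t
        d = 0                  # number of tied events at t
        for ti, ei, xi in zip(times, events, x):
            if ti < t:
                continue
            w = math.exp(beta * xi)
            s_r += w
            a_r += xi * w
            b_r += xi * xi * w
            if ei and ti == t:
                d += 1
                score += xi
                s_d += w
                a_d += xi * w
                b_d += xi * xi * w
        for k in range(d):
            # Efron removes a fraction k/d of the tied events from the risk
            # set for the k-th tied event; Breslow removes nothing.
            f = k / d if ties == "efron" else 0.0
            denom = s_r - f * s_d
            mean = (a_r - f * a_d) / denom
            score -= mean
            info += (b_r - f * b_d) / denom - mean ** 2
    return score, info


def fit_cox(times, events, x, ties="efron", max_iter=50, tol=1e-10):
    """Return (log hazard ratio, standard error) for a one-covariate Cox model."""
    if ties not in ("efron", "breslow"):
        raise ValueError("ties must be 'efron' or 'breslow'")
    beta = 0.0
    for _ in range(max_iter):
        score, info = _score_and_info(beta, times, events, x, ties)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Re-evaluate the information at the final estimate for the standard error.
    _, info = _score_and_info(beta, times, events, x, ties)
    return beta, 1.0 / math.sqrt(info)


def wald_summary(beta, se, level=0.95):
    """Hazard ratio, Wald confidence interval, and two-sided Wald p-value."""
    normal = NormalDist()
    z_crit = normal.inv_cdf(0.5 + level / 2)
    p_value = 2 * (1 - normal.cdf(abs(beta / se)))
    return {
        "hr": math.exp(beta),
        "lower": math.exp(beta - z_crit * se),
        "upper": math.exp(beta + z_crit * se),
        "p": p_value,
    }


# Hypothetical toy data with heavy ties: x = 1 is the arm with earlier events.
times = [1, 1, 1, 2, 2, 3, 1, 2, 3, 3, 3, 3]
events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
arm = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

for method in ("breslow", "efron"):
    beta_hat, se_hat = fit_cox(times, events, arm, ties=method)
    print(method, wald_summary(beta_hat, se_hat))
```

Because the Wald p-value and the confidence interval in this sketch are computed from the same estimate, standard error, and normal quantile, the interval excludes one exactly when the p-value falls below the matching significance level. In these toy data the Breslow estimate lies closer to the null than the Efron estimate; when no event times are tied, the two approximations give identical fits.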
