Causal estimands and confidence intervals associated with Wilcoxon-Mann-Whitney tests in randomized experiments.

Michael P Fay, Erica H Brittain, Joanna H Shih, Dean A Follmann, Erin E Gabriel

Published in: Statistics in Medicine (2018)

Although the P value from a Wilcoxon-Mann-Whitney test is often used with randomized experiments, it is rarely accompanied by a causal effect estimate and its confidence interval. The natural parameter for the Wilcoxon-Mann-Whitney test is the Mann-Whitney parameter, ϕ, which measures the probability that a randomly selected individual in the treatment arm will have a larger response than a randomly selected individual in the control arm (plus an adjustment for ties). We show that the Mann-Whitney parameter may be framed as a causal parameter and show that it is not equal to a closely related causal effect, ψ, the probability that a randomly selected individual will have a larger response under treatment than under control (plus an adjustment for ties). Unlike the Mann-Whitney parameter, ψ is nonidentifiable from a randomized experiment. We review the paradox, first expressed by Hand, that the ψ parameter may imply that the treatment is worse (or better) than control, while the Mann-Whitney parameter shows the opposite. We review some nonparametric assumptions that rule out Hand's paradox through bounds on ψ and use bootstrap methods to make inferences on those bounds. We explore the relationship of the proportional odds parameter to Hand's paradox, showing that the paradox may occur for proportional odds parameters between 1/9 and 9. Thus, large effects are needed to ensure that if treatment appears better by the Mann-Whitney parameter, then treatment improves responses in most individuals. We demonstrate these issues using a vaccine trial.
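The distinction between ϕ and ψ can be illustrated with a small numerical sketch. The three-person population below is hypothetical and is not taken from the paper or its vaccine trial; the function names are likewise illustrative, and the tie adjustment is assumed to count a tie as one half. Each individual is given both potential outcomes, which is never observable in a real experiment and is the reason ψ cannot be identified there.

```python
from itertools import product


def pair_score(y_treat, y_ctrl):
    """Score 1 if the treated response is larger, 1/2 for a tie, else 0."""
    if y_treat > y_ctrl:
        return 1.0
    if y_treat == y_ctrl:
        return 0.5
    return 0.0


def mann_whitney_phi(treated, control):
    """phi: compare every treated response with every control response."""
    scores = [pair_score(t, c) for t, c in product(treated, control)]
    return sum(scores) / len(scores)


def individual_psi(potential_outcomes):
    """psi: compare each individual's own treated and control responses.

    Requires both potential outcomes per individual, so it can only be
    computed in a thought experiment like this one.
    """
    scores = [pair_score(y1, y0) for y0, y1 in potential_outcomes]
    return sum(scores) / len(scores)


# Hypothetical population: (response under control, response under treatment).
population = [(1, 2), (3, 4), (5, 0)]

control = [y0 for y0, _ in population]
treated = [y1 for _, y1 in population]

phi = mann_whitney_phi(treated, control)
psi = individual_psi(population)

print(f"phi = {phi:.3f}")  # below 1/2: treatment looks worse
print(f"psi = {psi:.3f}")  # above 1/2: most individuals improve
```

In this toy population two of the three individuals respond better under treatment (ψ = 2/3), yet only three of the nine treated-versus-control pairs favor treatment (ϕ = 1/3), so the two parameters point in opposite directions, as in Hand's paradox.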
