Login / Signup

Copula selection models for non-Gaussian outcomes that are missing not at random.

Manuel GomesRosalba RadiceJose Camarena BrenesGiampiero Marra
Published in: Statistics in medicine (2018)
Missing not at random (MNAR) data pose key challenges for statistical inference because the substantive model of interest is typically not identifiable without imposing further (eg, distributional) assumptions. Selection models have been routinely used for handling MNAR by jointly modeling the outcome and selection variables and typically assuming that these follow a bivariate normal distribution. Recent studies have advocated parametric selection approaches, for example, estimated by multiple imputation and maximum likelihood, that are more robust to departures from the normality assumption compared with those assuming that nonresponse and outcome are jointly normally distributed. However, the proposed methods have been mostly restricted to a specific joint distribution (eg, bivariate t-distribution). This paper discusses a flexible copula-based selection approach (which accommodates a wide range of non-Gaussian outcome distributions and offers great flexibility in the choice of functional form specifications for both the outcome and selection equations) and proposes a flexible imputation procedure that generates plausible imputed values from the copula selection model. A simulation study characterizes the relative performance of the copula model compared with the most commonly used selection models for estimating average treatment effects with MNAR data. We illustrate the methods in the REFLUX study, which evaluates the effect of laparoscopic surgery on long-term quality of life in patients with reflux disease. We provide software code for implementing the proposed copula framework using the R package GJRM.
Keyphrases
  • type diabetes
  • big data
  • adipose tissue
  • minimally invasive
  • single cell
  • insulin resistance
  • artificial intelligence