
Non-parametric inference about mean functionals of non-ignorable non-response data without identifying the joint distribution.

Wei Li, Wang Miao, Eric Tchetgen Tchetgen
Published in: Journal of the Royal Statistical Society. Series B, Statistical methodology (2023)
We consider identification and inference about mean functionals of observed covariates and an outcome variable subject to non-ignorable missingness. By leveraging a shadow variable, we establish a necessary and sufficient condition for identification of the mean functional even if the full data distribution is not identified. We further characterize a necessary condition for √n-estimability of the mean functional. This condition naturally strengthens the identifying condition, and it requires the existence of a function as a solution to a representer equation that connects the shadow variable to the mean functional. Solutions to the representer equation may not be unique, which presents substantial challenges for non-parametric estimation, and standard theories for non-parametric sieve estimators are not applicable here. We construct a consistent estimator of the solution set and then adapt the theory of extremum estimators to find from the estimated set a consistent estimator of an appropriately chosen solution. The estimator is asymptotically normal, locally efficient and attains the semi-parametric efficiency bound under certain regularity conditions. We illustrate the proposed approach via simulations and a real data application on home pricing.
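To make the setting concrete, the following is a minimal simulation sketch of non-ignorable missingness with a shadow variable. It is not the estimator proposed in the paper. The data-generating model, the function names (`simulate`, `full_mean`, `complete_case_mean`, `oracle_ipw_mean`) and all parameter values are assumptions chosen for illustration. Here `z` plays the role of a shadow variable: it is associated with the outcome `y` but does not affect the response indicator `r` once `y` is given. The sketch only shows why the problem is hard: the complete-case mean is biased, while weighting by the true response probability (unknown in practice) removes the bias.

```python
import math
import random


def simulate(n=20000, seed=0):
    """Draw (z, y, r, p) tuples from a hypothetical model (an assumption,
    not taken from the paper)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)               # shadow variable
        y = z + rng.gauss(0.0, 1.0)           # outcome; true mean is 0
        p = 1.0 / (1.0 + math.exp(-y))        # response probability depends on y itself
        r = rng.random() < p                  # r is True when y is observed
        rows.append((z, y, r, p))
    return rows


def full_mean(rows):
    """Mean of y using every unit (infeasible in practice: needs missing y)."""
    return sum(y for _, y, _, _ in rows) / len(rows)


def complete_case_mean(rows):
    """Naive mean of y over responders only."""
    observed = [y for _, y, r, _ in rows if r]
    return sum(observed) / len(observed)


def oracle_ipw_mean(rows):
    """Normalized inverse-probability-weighted mean using the TRUE response
    probability. This is an oracle benchmark, since p is unknown with real data."""
    num = sum(y / p for _, y, r, p in rows if r)
    den = sum(1.0 / p for _, _, r, p in rows if r)
    return num / den


if __name__ == "__main__":
    rows = simulate()
    print("full-data mean:     ", round(full_mean(rows), 3))
    print("complete-case mean: ", round(complete_case_mean(rows), 3))
    print("oracle IPW mean:    ", round(oracle_ipw_mean(rows), 3))
```

Because units with larger `y` are more likely to respond in this sketch, the complete-case mean overstates the true mean of 0. The oracle weights are available only because the simulation knows the response mechanism; with real data that mechanism depends on the unobserved outcome, which is the gap a shadow variable is used to address.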
Keyphrases
  • electronic health record
  • big data
  • single cell
  • healthcare
  • machine learning
  • bioinformatics analysis
  • artificial intelligence
  • data analysis
  • solid state