On the Distribution of Summary Statistics for Missing Data.

B M Ringham, S M Kreidler, K E Muller, D H Glueck

Published in: Communications in Statistics: Theory and Methods (2018)

Under an assumption that missing values occur randomly in a matrix, formulae are developed for the expected value and variance of six statistics that summarize the number and location of the missing values. For a seventh statistic, a regression model based on simulated data yields an estimate of the expected value. The results can be used in the development of methods to control the Type I error and approximate power and sample size for multilevel and longitudinal studies with missing data.
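As a rough illustration of the kind of result described, the sketch below assumes that each cell of the matrix is missing independently with the same probability. This is one possible reading of "missing values occur randomly"; the abstract does not specify the exact missingness model. The two statistics shown (total number of missing cells and number of complete rows) are chosen for illustration only and are not claimed to be among the paper's seven. All function and parameter names are hypothetical. Under this assumption both counts are binomial, so their means and variances have closed forms that a small simulation can check.

```python
import random


def simulate_mask(n_rows, n_cols, p_missing, rng):
    """Return an n_rows x n_cols list of booleans; True marks a missing cell.

    Assumption (not from the paper): cells are missing independently,
    each with probability p_missing.
    """
    return [[rng.random() < p_missing for _ in range(n_cols)]
            for _ in range(n_rows)]


def summarize(mask):
    """Two illustrative summary statistics of a missingness mask."""
    total_missing = sum(sum(row) for row in mask)
    complete_rows = sum(1 for row in mask if not any(row))
    return total_missing, complete_rows


def expected_values(n_rows, n_cols, p_missing):
    """Closed-form mean and variance under the independence assumption."""
    n_cells = n_rows * n_cols
    # A row is complete only if all of its n_cols cells are observed.
    p_complete = (1 - p_missing) ** n_cols
    return {
        # Total missing cells ~ Binomial(n_cells, p_missing).
        "mean_total": n_cells * p_missing,
        "var_total": n_cells * p_missing * (1 - p_missing),
        # Complete rows ~ Binomial(n_rows, p_complete).
        "mean_complete": n_rows * p_complete,
        "var_complete": n_rows * p_complete * (1 - p_complete),
    }


def monte_carlo_means(n_rows, n_cols, p_missing, n_reps=2000, seed=0):
    """Average both statistics over n_reps simulated masks."""
    rng = random.Random(seed)
    sum_total = 0
    sum_complete = 0
    for _ in range(n_reps):
        total, complete = summarize(
            simulate_mask(n_rows, n_cols, p_missing, rng))
        sum_total += total
        sum_complete += complete
    return sum_total / n_reps, sum_complete / n_reps


if __name__ == "__main__":
    theory = expected_values(10, 4, 0.1)
    sim_total, sim_complete = monte_carlo_means(10, 4, 0.1)
    print("theory:", theory)
    print("simulated means:", sim_total, sim_complete)
```

Comparing the simulated averages with the closed-form means gives a quick sanity check of a formula before it is relied on in, for example, a power or sample size calculation.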

Keyphrases

electronic health record
big data
data analysis
cross sectional
deep learning