Comparisons of statistical methods for handling attrition in a follow-up visit with complex survey sampling.

Jianwen Cai, Donglin Zeng, Haolin Li, Nicole M Butera, Pedro L Baldoni, Poulami Maitra, Li Dong
Published in: Statistics in Medicine (2023)
Design-based analysis, which accounts for the design features of the study, is commonly used to conduct data analysis in studies with complex survey sampling, such as the Hispanic Community Health Study/Study of Latinos (HCHS/SOL). In this type of longitudinal study, attrition has often been a problem. Although various statistical approaches have been proposed to handle attrition, such as inverse probability weighting (IPW), non-response cell weighting (NRCW), multiple imputation (MI), and the full information maximum likelihood (FIML) approach, there has not been a systematic assessment of these methods to compare their performance in design-based analyses. In this article, we perform extensive simulation studies and compare the performance of different missing data methods in linear and generalized linear population models, and under different missing data mechanisms. We find that design-based analysis is able to produce valid estimation and statistical inference when the missing data are handled appropriately using the IPW, NRCW, MI, or FIML approach under a missing-completely-at-random or missing-at-random mechanism and when the missingness model is correctly specified or over-specified. We also illustrate the use of these methods using data from HCHS/SOL.
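As a rough illustration of the IPW idea mentioned in the abstract, the sketch below adjusts survey weights for attrition under a missing-at-random mechanism. It is not the authors' code or simulation design: the data are simulated, and every name in it (`fit_response_model`, `weighted_mean`, the covariate `x`, the base weights `w`, the response indicator `r`) is a hypothetical choice made for this example. Fitting the response model with the base weights is also an assumption of the sketch, and design-based variance estimation (strata, clusters) is not covered.

```python
import math
import random


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_response_model(x, r, w, n_iter=25):
    """Weighted logistic regression of r on (1, x), fitted by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ri, wi in zip(x, r, w):
            p = expit(b0 + b1 * xi)
            g0 += wi * (ri - p)          # score for the intercept
            g1 += wi * (ri - p) * xi     # score for the slope
            v = wi * p * (1.0 - p)
            h00 += v                     # information matrix entries
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def weighted_mean(values, weights):
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)


# Simulated (hypothetical) cohort: x is a baseline covariate, y is the
# follow-up outcome, w is a base sampling weight, r = 1 if the participant
# returned for the follow-up visit. Response depends on x only (MAR).
rng = random.Random(2023)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
w = [rng.uniform(1.0, 3.0) for _ in range(n)]
r = [1 if rng.random() < expit(0.5 + xi) else 0 for xi in x]

# Benchmark that would be available without attrition.
full_mean = weighted_mean(y, w)

# Complete-case analysis: keep respondents, leave the weights unchanged.
resp = [i for i in range(n) if r[i] == 1]
cc_mean = weighted_mean([y[i] for i in resp], [w[i] for i in resp])

# IPW: divide each respondent's base weight by the estimated response probability.
b0, b1 = fit_response_model(x, r, w)
w_adj = [w[i] / expit(b0 + b1 * x[i]) for i in resp]
ipw_est = weighted_mean([y[i] for i in resp], w_adj)

print("full-sample weighted mean:", round(full_mean, 3))
print("complete-case mean:       ", round(cc_mean, 3))
print("IPW-adjusted mean:        ", round(ipw_est, 3))
```

In this toy setting the response model is correctly specified, which is the condition the abstract names for valid inference. Dropping `x` from the response model would be expected to leave the attrition bias in place.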
Keyphrases
  • data analysis
  • electronic health record
  • stem cells
  • machine learning
  • social media
  • health information
  • mesenchymal stem cells