Post-selection inference in regression models for group testing data.

Qinyan Shen, Karl B Gregory, Xianzheng Huang
Published in: Biometrics (2024)
We develop a methodology for valid inference after variable selection in logistic regression when the responses are partially observed, that is, when one observes a set of error-prone testing outcomes instead of the true values of the responses. Aiming at selecting important covariates while accounting for missing information in the response data, we apply the expectation-maximization algorithm to compute maximum likelihood estimators subject to LASSO penalization. Subsequent to variable selection, we make inferences on the selected covariate effects by extending post-selection inference methodology based on the polyhedral lemma. Empirical evidence from our extensive simulation study suggests that our post-selection inference results are more reliable than those from naive inference methods that use the same data to perform variable selection and inference without adjusting for variable selection.
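To make the estimation step concrete, here is a minimal sketch of an EM algorithm with a LASSO-penalized M-step. It is not the authors' implementation and rests on several simplifying assumptions: each subject is tested individually (pool size 1) rather than in pools, the assay sensitivity `se` and specificity `sp` are known, and the M-step takes a few proximal-gradient (soft-thresholding) steps instead of solving the penalized problem exactly, which makes it a generalized EM. All function names and parameter values are hypothetical. The post-selection inference step based on the polyhedral lemma is not covered.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|, the source of exact zeros under LASSO."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def linear_predictor(b0, beta, x):
    return b0 + sum(b * xj for b, xj in zip(beta, x))


def e_step(X, z, b0, beta, se, sp):
    """Posterior P(Y_i = 1 | Z_i = z_i) for each error-prone test outcome."""
    w = []
    for x, zi in zip(X, z):
        p = sigmoid(linear_predictor(b0, beta, x))
        if zi == 1:
            num, den = se * p, se * p + (1.0 - sp) * (1.0 - p)
        else:
            num, den = (1.0 - se) * p, (1.0 - se) * p + sp * (1.0 - p)
        w.append(num / den)
    return w


def m_step(X, w, b0, beta, lam, step=0.5, iters=30):
    """A few proximal-gradient steps on the L1-penalized expected log-likelihood.

    The intercept b0 is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(beta)
    for _ in range(iters):
        g0, g = 0.0, [0.0] * d
        for x, wi in zip(X, w):
            r = sigmoid(linear_predictor(b0, beta, x)) - wi
            g0 += r
            for j in range(d):
                g[j] += r * x[j]
        b0 -= step * g0 / n
        beta = [soft_threshold(b - step * gj / n, step * lam)
                for b, gj in zip(beta, g)]
    return b0, beta


def em_lasso(X, z, se, sp, lam, em_iters=20):
    """Alternate E- and M-steps, starting from all-zero coefficients."""
    b0, beta = 0.0, [0.0] * len(X[0])
    for _ in range(em_iters):
        w = e_step(X, z, b0, beta, se, sp)
        b0, beta = m_step(X, w, b0, beta, lam)
    return b0, beta


def simulate(n, true_b0, true_beta, se, sp, seed=0):
    """Draw covariates, latent true statuses, and misclassified test outcomes."""
    rng = random.Random(seed)
    X, z = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in true_beta]
        p = sigmoid(linear_predictor(true_b0, true_beta, x))
        y = 1 if rng.random() < p else 0
        p_pos = se if y == 1 else 1.0 - sp
        z.append(1 if rng.random() < p_pos else 0)
        X.append(x)
    return X, z


if __name__ == "__main__":
    X, z = simulate(300, -0.5, [2.0, 0.0, 0.0, 0.0], se=0.95, sp=0.98)
    b0, beta = em_lasso(X, z, se=0.95, sp=0.98, lam=0.05)
    print("intercept:", round(b0, 3))
    print("selected covariates:", [j for j, b in enumerate(beta) if b != 0.0])
```

The E-step replaces each unobserved true status with its posterior probability given the test outcome, and the M-step fits a penalized logistic regression to those fractional responses. Covariates whose coefficients are shrunk exactly to zero are dropped. Naive inference would then refit on the selected set using the same data, which is the practice the abstract cautions against.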
Keyphrases
  • single cell
  • electronic health record
  • type diabetes
  • machine learning
  • big data
  • healthcare
  • hiv infected
  • data analysis
  • antiretroviral therapy
  • glycemic control