Correcting prevalence estimation for biased sampling with testing errors.

Lili Zhou, Daniel Andrés Díaz-Pachón, Chen Zhao, J Sunil Rao, Ola Hössjer
Published in: Statistics in Medicine (2023)
Sampling for prevalence estimation of infection is subject to bias from both oversampling of symptomatic individuals and error-prone tests. As a result, naïve estimators of prevalence (i.e., the proportion of observed infected individuals in the sample) can be very far from the true proportion of infected individuals. In this work, we present a method of prevalence estimation that reduces the bias due to both testing errors and oversampling of symptomatic individuals, eliminating it altogether in some scenarios. Moreover, this procedure accommodates stratified errors, in which tests have different error rate profiles for symptomatic and asymptomatic individuals. The resulting algorithms are easy to implement, and code is provided; they produce better prevalence estimates than other methods (in terms of reducing and/or removing bias), as demonstrated by formal results, simulations, and COVID-19 data from the Israeli Ministry of Health.
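To make the two sources of bias concrete, the sketch below contrasts the naïve estimator with a simple stratified correction. It is not the authors' algorithm: it combines the classical Rogan-Gladen adjustment for test error, applied separately to symptomatic and asymptomatic strata, with reweighting by the population share of symptomatic individuals. All function and parameter names are hypothetical, and the sketch assumes that each stratum's sensitivity and specificity, as well as the population symptomatic share, are known.

```python
def naive_prevalence(n_pos_sym, n_sym, n_pos_asym, n_asym):
    """Proportion of positive tests in the pooled sample (no correction)."""
    return (n_pos_sym + n_pos_asym) / (n_sym + n_asym)


def rogan_gladen(pos_rate, sensitivity, specificity):
    """Classical Rogan-Gladen correction of an observed positive rate.

    Assumes sensitivity + specificity > 1 (the test beats chance).
    The result is clipped to [0, 1].
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (pos_rate + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))


def corrected_prevalence(n_pos_sym, n_sym, n_pos_asym, n_asym,
                         se_sym, sp_sym, se_asym, sp_asym,
                         pop_share_sym):
    """Stratified correction (illustrative, not the paper's method).

    1. Correct each stratum's positive rate for its own error profile.
    2. Reweight the strata by the assumed-known population share of
       symptomatic individuals to undo oversampling of that stratum.
    """
    prev_sym = rogan_gladen(n_pos_sym / n_sym, se_sym, sp_sym)
    prev_asym = rogan_gladen(n_pos_asym / n_asym, se_asym, sp_asym)
    return pop_share_sym * prev_sym + (1.0 - pop_share_sym) * prev_asym


if __name__ == "__main__":
    # Hypothetical sample: symptomatic people make up half the sample
    # but are assumed to be only 10% of the population.
    args = dict(n_pos_sym=50, n_sym=100, n_pos_asym=10, n_asym=100)
    print("naive:    ", naive_prevalence(**args))
    print("corrected:", corrected_prevalence(
        **args, se_sym=0.9, sp_sym=0.95, se_asym=0.8, sp_asym=0.95,
        pop_share_sym=0.1))
```

In this made-up sample the naïve estimate is pulled toward the symptomatic stratum because that stratum is overrepresented. The reweighting step removes that effect only to the extent that the assumed population share is accurate.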
Keyphrases
  • risk factors
  • public health
  • coronavirus disease
  • machine learning
  • emergency department
  • mental health
  • deep learning
  • adverse drug
  • big data
  • artificial intelligence
  • molecular dynamics
  • electronic health record