This paper presents methods to estimate the number of persons with HIV in jails in the US state of North Carolina by applying finite population inferential approaches to data collected using web scraping and record linkage techniques. Administrative data are linked with web-scraped rosters of incarcerated persons in a nonrandom subset of counties. Outcome regression and calibration weighting are adapted for state-level estimation. The methods are compared in simulations and applied to the North Carolina data. Outcome regression yielded more precise inference and allowed for county-level estimates, an important study objective, while calibration weighting exhibited double robustness, performing well when either the outcome model or the weight model, but not both, was misspecified.
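The two estimation strategies named in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify its models, so the sketch assumes a linear outcome model and linear (GREG-style) calibration with unit base weights. The county data, the single auxiliary variable (jail population), and the function names `outcome_regression_total` and `calibration_total` are all hypothetical.

```python
import numpy as np


def outcome_regression_total(X, y_sampled, sampled):
    """Prediction estimator of a finite population total.

    Fits a linear model on the sampled units, then adds the observed
    outcomes to the model predictions for the unsampled units.
    Also returns unit-level predictions (useful for county-level estimates).
    """
    beta, *_ = np.linalg.lstsq(X[sampled], y_sampled, rcond=None)
    predictions = X @ beta
    total = y_sampled.sum() + predictions[~sampled].sum()
    return total, predictions


def calibration_total(X, y_sampled, sampled):
    """Linear calibration estimator of a finite population total.

    Adjusts unit base weights (an assumption of this sketch) so that the
    weighted sample totals of the auxiliary variables match the known
    population totals, then applies those weights to the outcomes.
    """
    Xs = X[sampled]
    d = np.ones(Xs.shape[0])                 # base weights, assumed equal to 1
    t_pop = X.sum(axis=0)                    # known population totals of X
    t_samp = (d[:, None] * Xs).sum(axis=0)   # base-weighted sample totals
    M = Xs.T @ (d[:, None] * Xs)
    lam = np.linalg.solve(M, t_pop - t_samp)
    w = d * (1.0 + Xs @ lam)                 # calibrated weights
    return w @ y_sampled, w


# Toy population of 10 hypothetical counties (made-up numbers).
jail_pop = np.array([50, 80, 120, 200, 310, 400, 520, 650, 800, 1000], dtype=float)
X = np.column_stack([np.ones_like(jail_pop), jail_pop])  # intercept + auxiliary

# Nonrandom sample: only the larger counties have scraped rosters.
sampled = jail_pop >= 300

# Toy outcome that is exactly linear in X, so both estimators should
# recover the true total (up to floating point error).
y = 2.0 + 0.01 * jail_pop
y_sampled = y[sampled]

or_total, county_predictions = outcome_regression_total(X, y_sampled, sampled)
cal_total, weights = calibration_total(X, y_sampled, sampled)
true_total = y.sum()
```

In this toy setup the outcome is exactly linear in the auxiliary variable, so both estimators reproduce the true total even though the sample is nonrandom. With noisy outcomes or a misspecified model the two would generally differ, which is the kind of comparison the simulations in the paper address.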