Differential Privacy Protections in 2020 U.S. Decennial Census Data Do Not Impede Measurement of Racial and Ethnic Disparities.

Joshua Snoke, Ann Haas, Steven C Martino, Marc N Elliott
Published in: Medical care research and review : MCRR (2024)
Census data are vital to health care research but must also protect respondents' confidentiality. The 2020 decennial Census employs a new Differential Privacy framework; this study examines its effect on the accuracy of an important tool for measuring health disparities, the Bayesian Improved Surname and Geocoding (BISG) algorithm, which uses Census Block Group data to estimate race and ethnicity when self-reported data are unavailable. Using self-reported race and ethnicity data as our standard, we compared the accuracy of BISG estimates calculated using the original 2010 Census counts to the accuracy of estimates calculated using 2010 data but with 2020 Differential Privacy in place. The Differential Privacy methodology slightly decreases BISG accuracy for American Indian and Alaska Native people but has little effect for other groups, suggesting that the methodology will not impede health disparities research that employs BISG and similar methods.
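For readers unfamiliar with BISG, the core step is a Bayes' rule update that combines a surname-based prior with the racial and ethnic composition of a person's Census Block Group. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the four-category grouping, and all counts and probabilities are hypothetical and chosen only to make the arithmetic visible. Differential Privacy changes the block group counts that feed the geographic term, which is how it could affect BISG accuracy.

```python
# Minimal BISG-style sketch. All names and numbers are hypothetical.

def geo_likelihood(block_group_counts, national_counts):
    """P(block group | race): the share of each group's national
    population that lives in this block group."""
    return {
        race: block_group_counts[race] / national_counts[race]
        for race in block_group_counts
    }


def bisg_posterior(p_race_given_surname, p_geo_given_race):
    """Bayes' rule: P(race | surname, geo) is proportional to
    P(race | surname) * P(geo | race), normalized over all groups."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in p_race_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All groups have zero probability; cannot normalize.")
    return {race: value / total for race, value in unnormalized.items()}


# Made-up surname prior, as would come from a surname list.
surname_prior = {"white": 0.05, "black": 0.01, "hispanic": 0.92, "other": 0.02}

# Made-up counts for one block group and for the nation.
block_group_counts = {"white": 600, "black": 200, "hispanic": 150, "other": 50}
national_counts = {
    "white": 200_000_000,
    "black": 40_000_000,
    "hispanic": 50_000_000,
    "other": 30_000_000,
}

example_posterior = bisg_posterior(
    surname_prior, geo_likelihood(block_group_counts, national_counts)
)
print(example_posterior)
```

Rerunning the same calculation with privacy-protected block group counts in place of the original counts, and comparing both sets of posteriors against self-reported race and ethnicity, mirrors the kind of comparison the abstract describes.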
Keyphrases
  • big data
  • electronic health record
  • healthcare
  • health information
  • public health
  • machine learning
  • artificial intelligence
  • mental health
  • deep learning
  • risk assessment
  • health insurance