A process to deduplicate individuals for regional chronic disease prevalence estimates using a distributed data network of electronic health records.

Kenneth A Scott, Sara Deakyne Davies, Rachel Zucker, Toan Ong, Emily McCormick Kraus, Michael G Kahn, Jessica Bondy, Matt F Daley, Kate Horle, Emily Bacon, Lisa Schilling, Tessa Crume, Romana Hasnain-Wynia, Seth Foldy, Gregory Budney, Arthur J Davidson

Published in: Learning Health Systems (2021)

We implemented an extensible process, dependent on a health information exchange (HIE), that deduplicates individuals to produce less biased prevalence estimates in a distributed data network (DDN). Our null pilot findings have limited generalizability: overlap was small and likely insufficient to influence prevalence estimates. Other factors, including the number and size of partners, the matching algorithm, and the electronic phenotype, may influence the degree of duplication bias. Additional use cases may improve understanding of duplication bias and reveal other principles and insights. This study informed how DDNs could support learning health systems' response to public health challenges and improve regional health.
