Exploring the Privacy-Preserving Properties of Word Embeddings: Algorithmic Validation Study.

Mohamed Abdalla, Moustafa Abdalla, Graeme Hirst, Frank Rudzicz
Published in: Journal of Medical Internet Research (2020)
Special care must be taken when sharing word embeddings created from clinical texts, as current approaches may compromise patient privacy. If removal of protected health information (PHI) is used for anonymization before traditional word embeddings are trained, it is possible to attribute sensitive information to patients who have not been fully deidentified by the (necessarily imperfect) removal algorithms. A promising alternative (ie, anonymization by PHI replacement) may avoid these flaws. Our results are timely and critical, as an increasing number of researchers are pushing for publicly available health data.