Evaluation of data processing pipelines on real-world electronic health records data for the purpose of measuring patient similarity.
Maria Pikoula, Constantinos Kallis, Sephora Madjiheurem, Jennifer Kathleen Quint, Mona Bafadhel, Spiros Denaxas
Published in: PloS one (2023)
Data transformation has downstream and unforeseen consequences in cluster analysis. Rather than viewing this process as a black box, we have shown ways to quantitatively and qualitatively evaluate and select the appropriate preprocessing pipeline.
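
As a minimal illustration of how a transformation choice can change cluster assignments, the sketch below clusters the same toy records with and without z-score scaling and measures how much the two partitions agree. Everything in it is an assumption made for illustration and is not taken from the paper: the six records, the feature names (age and a laboratory count), the use of k-means with fixed starting points, and the Rand index as the agreement measure.

```python
import math
from itertools import combinations

# Hypothetical toy records: (age in years, a laboratory count on a much larger scale).
records = [(30, 100), (31, 400), (32, 700), (70, 150), (71, 450), (72, 750)]

def identity(rows):
    """Pipeline A: leave the features on their original scales."""
    return [[float(v) for v in r] for r in rows]

def zscore(rows):
    """Pipeline B: centre each feature and divide by its standard deviation."""
    scaled_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        scaled_cols.append([(v - mean) / sd if sd else 0.0 for v in col])
    return [list(r) for r in zip(*scaled_cols)]

def kmeans(rows, init_idx, max_iter=50):
    """Plain k-means with fixed initial centres, so the result is deterministic."""
    centers = [list(rows[i]) for i in init_idx]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centers)), key=lambda c: math.dist(r, centers[c]))
            for r in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def rand_index(a, b):
    """Fraction of record pairs on which two partitions agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

# Same starting records (first and fourth) for both pipelines.
raw_labels = kmeans(identity(records), init_idx=[0, 3])
z_labels = kmeans(zscore(records), init_idx=[0, 3])
agreement = rand_index(raw_labels, z_labels)

print("unscaled:", raw_labels)
print("z-scored:", z_labels)
print("Rand index between the two partitions:", agreement)
```

In this toy setting the unscaled run is dominated by the large-scale laboratory feature, while the z-scored run separates the records by age, so the two pipelines disagree on many record pairs. A fuller evaluation would repeat the comparison across several pipelines and starting points and would add qualitative checks, for example whether the resulting clusters are interpretable to a domain expert.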