
Application of Natural Language Processing in Electronic Health Record Data Extraction for Navigating Prostate Cancer Care: A Narrative Review.

Ansh Bhatia, Renil Titus, Joao Gabriel Porto, Jonathan Katz, Diana M Lopategui, Robert Marcovich, Dipen J Parekh, Hemendra Navinchandra Shah
Published in: Journal of Endourology (2024)
Introduction: Natural language processing (NLP)-based data extraction from electronic health records (EHRs) holds significant potential to simplify clinical management and aid research. This review evaluates the current landscape of NLP-based data extraction in prostate cancer (PCa) management. Materials and Methods: We searched the PubMed and Google Scholar databases using the keywords "Natural Language Processing," "Prostate Cancer," "data extraction," and "EHR," with variations of each. No language or time limits were imposed. Results were collected in a standardized manner, recording country of origin, sample size, algorithm, study objective, and model performance. Precision, recall, and F1 score were collected as metrics of model performance. Results: Of the 14 studies included in the review, 2 focused on documenting digital rectal examinations; 1 on identifying and quantifying pain secondary to PCa; 8 on extracting staging/grading information from clinical reports, with an emphasis on TNM classification, risk stratification, and identifying metastasis; 2 on patient-centered post-treatment outcomes such as incontinence, erectile dysfunction, and bowel dysfunction; and 1 on loneliness/social isolation following PCa diagnosis. All models showed moderate to high data annotation/extraction accuracy compared with the gold standard of manual data extraction by chart review. Despite their potential, NLP models struggle with ambiguous, institution-specific language and contextual nuance, which leads to occasional inaccuracies in clinical data interpretation. Conclusion: NLP-based data extraction has effectively extracted various outcomes from PCa patients' EHRs. It holds the potential to automate outcome monitoring and data collection, saving time and labor.
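The performance metrics named in the Methods (precision, recall, and F1 score) can be illustrated with a minimal sketch. It assumes a binary extraction task in which each clinical note is labeled 1 if a finding is present and 0 otherwise, with manual chart review as the gold standard. The function name and the example labels are hypothetical and are not taken from any of the reviewed studies.

```python
def precision_recall_f1(gold, predicted):
    """Compare NLP-extracted binary labels against gold-standard labels.

    gold, predicted: equal-length sequences of 0/1 values, one per note.
    Returns (precision, recall, f1) as floats.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g and p)        # finding correctly extracted
    fp = sum(1 for g, p in pairs if not g and p)    # extracted but absent in chart review
    fn = sum(1 for g, p in pairs if g and not p)    # present in chart review but missed

    # Guard against division by zero when a class is empty.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical labels for six notes (illustrative only, not study data).
gold_labels = [1, 1, 1, 0, 0, 1]   # manual chart review
nlp_labels = [1, 1, 0, 1, 0, 1]    # NLP extraction
precision, recall, f1 = precision_recall_f1(gold_labels, nlp_labels)
```

In this toy example the model makes one false-positive and one false-negative extraction out of four true findings, so precision and recall coincide; in practice the two often diverge, which is why F1 is reported alongside them.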