Machine Learning Analysis to Identify Data Entry Errors in Prehospital Patient Care Reports: A Case Study of a National Out-of-Hospital Cardiac Arrest Registry.
Dong Hyun Choi, Jeong Ho Park, Young Ho Choi, Kyoung Jun Song, Sungwan Kim, Sang Do Shin
Published in: Prehospital Emergency Care (2022)
Background: The objective of this study was to develop and validate machine learning models for detecting data entry errors in a national out-of-hospital cardiac arrest (OHCA) prehospital patient care report database.

Methods: Adult OHCAs of presumed cardiac etiology were included. Data entry errors were defined as discrepancies between the coded data and the free-text note documenting the intervention or event; for example, information recorded as "absent" in the coded data but "present" in the free-text note. Machine learning models for error detection within nine core variables were developed using the extreme gradient boosting, logistic regression, extreme gradient boosting outlier detection, and K-nearest neighbor outlier detection algorithms, and were then validated for each variable.

Results: Among 12,100 OHCAs, the proportion of cases with at least one error type was 16.2%. The area under the receiver operating characteristic curve (AUC) of the best-performing model (the model with the highest AUC for each outcome variable) ranged from 0.71 to 0.95. Machine learning models detected errors most efficiently for place and initial rhythm: 82.6% of place errors and 93.8% of initial rhythm errors could be detected by checking only 11% and 35% of the data, respectively, rather than checking all data.

Conclusion: Machine learning models can detect data entry errors in the care reports of emergency medical services (EMS) clinicians with acceptable performance and can likely improve the efficiency of data quality control. EMS organizations that provide more prehospital interventions for OHCA patients could have higher error rates and may benefit from adopting error-detection models.
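The efficiency figures in the Results (a share of errors detected while checking only a fraction of the data) can be illustrated with a minimal sketch. This is not the authors' code: the function name `recall_at_review_fraction`, the scores, and the labels are hypothetical toy values, and the sketch assumes that records are reviewed in descending order of a model's predicted error score.

```python
def recall_at_review_fraction(scores, is_error, fraction):
    """Share of true errors found when only the top `fraction` of records,
    ranked by predicted error score, is manually reviewed.

    Hypothetical helper for illustration; not taken from the study.
    """
    if len(scores) != len(is_error):
        raise ValueError("scores and is_error must have the same length")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")

    total_errors = sum(is_error)
    if total_errors == 0:
        return 0.0  # nothing to detect

    # Number of records a reviewer would check (at least one).
    n_review = max(1, round(fraction * len(scores)))

    # Rank record indices from highest to lowest predicted error score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    caught = sum(is_error[i] for i in ranked[:n_review])
    return caught / total_errors


# Toy data (invented): model scores and ground-truth error flags for 10 records.
scores = [0.91, 0.15, 0.78, 0.05, 0.40, 0.88, 0.10, 0.22, 0.65, 0.03]
is_error = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]

recall_30 = recall_at_review_fraction(scores, is_error, 0.3)
recall_60 = recall_at_review_fraction(scores, is_error, 0.6)
print(f"Errors caught reviewing 30% of records: {recall_30:.1%}")
print(f"Errors caught reviewing 60% of records: {recall_60:.1%}")
```

In this toy setting, reviewing the three highest-scoring records finds two of the three errors. A statement such as "82.6% of place errors detected while checking 11% of data" corresponds to one point on this kind of curve, computed on the study's own validation data.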
Keyphrases
- machine learning
- big data
- electronic health record
- adverse drug
- emergency medical
- healthcare
- artificial intelligence
- patient safety
- cardiac arrest
- loop mediated isothermal amplification
- data analysis
- end stage renal disease
- mental health
- quality control
- quality improvement
- ejection fraction
- newly diagnosed
- palliative care
- quantum dots
- chronic kidney disease
- left ventricular
- smoking cessation
- heart rate
- peritoneal dialysis