Machine Learning Electronic Health Record Identification of Patients with Rheumatoid Arthritis: Algorithm Pipeline Development and Validation Study.
Tjardo D Maarseveen, Timo Meinderink, Marcel J T Reinders, Johannes Knitza, Thomas Wj Huizinga, Arnd Kleyer, David Simon, Erik Ben van den Akker, Rachel Knevel
Published in: JMIR medical informatics (2020)
We demonstrate that machine learning methods can extract the records of patients with rheumatoid arthritis from electronic health record data with high precision, enabling research on very large populations at limited cost. Our approach is language- and center-independent and could be applied to any type of diagnosis. We have developed our pipeline into a universally applicable, easy-to-implement workflow that equips centers with their own high-performing algorithm. This allows the creation of observational studies of unprecedented size, covering different countries, at low cost from data already available in electronic health record systems.
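As a rough illustration of the idea, the sketch below trains a small text classifier on labeled free-text notes and measures precision at a chosen probability cutoff. It is not the pipeline described in the study: the abstract does not name a specific model, so a multinomial naive Bayes classifier is used here purely as a stand-in, and the example notes, labels, function names, and the 0.5 cutoff are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (works for any alphabetic script)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def train_naive_bayes(notes, labels):
    """Count words per class; labels are 1 (RA) or 0 (not RA)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for text, label in zip(notes, labels):
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict_proba(model, text):
    """Estimate the probability that a note describes an RA patient."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in (0, 1):
        n_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)
        for tok in tokenize(text):
            if tok in vocab:  # ignore words never seen in training
                # Add-one (Laplace) smoothing avoids zero probabilities.
                score += math.log((word_counts[label][tok] + 1) / (n_words + len(vocab)))
        log_scores[label] = score
    # Convert log scores to a normalized probability in a numerically stable way.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp_scores[1] / (exp_scores[0] + exp_scores[1])


def precision(y_true, y_pred):
    """Fraction of predicted positives that are true positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical toy notes, invented for illustration only.
train_notes = [
    "seropositive rheumatoid arthritis, started methotrexate",
    "rheumatoid arthritis with swollen joints, anti-CCP positive",
    "gout flare in left toe, elevated uric acid",
    "osteoarthritis of the knee, no synovitis",
]
train_labels = [1, 1, 0, 0]
model = train_naive_bayes(train_notes, train_labels)

test_notes = [
    "rheumatoid arthritis, continue methotrexate",
    "uric acid elevated, gout suspected",
]
test_labels = [1, 0]
cutoff = 0.5  # hypothetical; a higher cutoff trades recall for precision
probs = [predict_proba(model, note) for note in test_notes]
preds = [1 if p >= cutoff else 0 for p in probs]
print("precision:", precision(test_labels, preds))
```

Because the tokenizer only splits on letters and the model learns its vocabulary from whatever labeled notes it is given, the same code could in principle be retrained on notes in another language or from another center, which is the kind of portability the abstract describes.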