Medical records-based chronic kidney disease phenotype for clinical care and "big data" observational and genetic studies.
Ning Shang, Atlas Khan, Fernanda Polubriaginof, Francesca Zanoni, Karla Mehl, David A Fasel, Paul E Drawz, Robert J Carrol, Joshua C Denny, Matthew A Hathcock, Adelaide M Arruda-Olson, Peggy L Peissig, Richard A Dart, Murray H Brilliant, Eric B Larson, David S Carrell, Sarah Pendergrass, Shefali Setia Verma, Marylyn DeRiggi Ritchie, Barbara Benoit, Vivian S Gainer, Elizabeth W Karlson, Adam S Gordon, Gail P Jarvik, Ian B Stanaway, David R Crosslin, Sumit Mohan, Iuliana Ionita-Laza, Nicholas P Tatonetti, Ali G Gharavi, George Hripcsak, Chunhua Weng, Krzysztof Kiryluk
Published in: NPJ digital medicine (2021)
Chronic kidney disease (CKD) is a slowly progressive disorder that is typically silent until its late stages, yet early intervention can significantly delay its progression. We designed a portable and scalable electronic CKD phenotype to facilitate early disease recognition and empower large-scale observational and genetic studies of kidney traits. The algorithm uses a combination of rule-based and machine-learning methods to automatically place patients on the staging grid of albuminuria by glomerular filtration rate ("A-by-G" grid). We manually validated the algorithm with 451 chart reviews across three medical systems, demonstrating an overall positive predictive value of 95% for CKD cases and 97% for healthy controls. Independent case-control validation using 2350 patient records showed a diagnostic specificity of 97% and a sensitivity of 87%. Applying the phenotype to 1.3 million patients showed that over 80% of CKD cases are undetected using ICD codes alone. We also demonstrated several large-scale applications of the phenotype, including identifying stage-specific kidney disease comorbidities, in silico estimation of kidney trait heritability in thousands of pedigrees reconstructed from medical records, and biobank-based multicenter genome-wide and phenome-wide association studies.
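To make the "A-by-G" grid concrete, the following is a minimal sketch of how a single pair of laboratory values could be mapped to a grid cell. It is not the authors' algorithm: the abstract does not detail its rule-based or machine-learning steps, and none of them are reproduced here. The cut-offs are the standard KDIGO categories for estimated GFR (mL/min/1.73 m²) and urine albumin-to-creatinine ratio (mg/g); the function names `g_stage`, `a_stage`, and `place_on_grid` are hypothetical and chosen only for illustration.

```python
from typing import Tuple


def g_stage(egfr: float) -> str:
    """Map eGFR (mL/min/1.73 m^2) to a KDIGO G category (assumed cut-offs)."""
    if egfr >= 90:
        return "G1"
    if egfr >= 60:
        return "G2"
    if egfr >= 45:
        return "G3a"
    if egfr >= 30:
        return "G3b"
    if egfr >= 15:
        return "G4"
    return "G5"


def a_stage(uacr_mg_per_g: float) -> str:
    """Map urine albumin-to-creatinine ratio (mg/g) to a KDIGO A category."""
    if uacr_mg_per_g < 30:
        return "A1"
    if uacr_mg_per_g <= 300:
        return "A2"
    return "A3"


def place_on_grid(egfr: float, uacr_mg_per_g: float) -> Tuple[str, str]:
    """Return the (G, A) cell for one eGFR / UACR pair.

    Hypothetical helper: it looks at a single measurement pair only and
    ignores chronicity, missing data, and any other clinical context.
    """
    return g_stage(egfr), a_stage(uacr_mg_per_g)


if __name__ == "__main__":
    # Illustrative values only, not patient data.
    print(place_on_grid(50.0, 150.0))
```

A real phenotype would need more than this lookup, for example handling repeated measurements over time and patients without albuminuria testing; the sketch only shows the shape of the grid assignment.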
Keyphrases
- chronic kidney disease
- end stage renal disease
- machine learning
- genome wide
- case control
- big data
- healthcare
- peritoneal dialysis
- ejection fraction
- artificial intelligence
- newly diagnosed
- randomized controlled trial
- deep learning
- prognostic factors
- dna methylation
- systematic review
- gene expression
- cross sectional
- clinical trial
- copy number
- case report
- lymph node
- pain management
- health insurance