Nationwide prediction of type 2 diabetes comorbidities.
Piotr Dworzynski, Martin Aasbrenn, Klaus Rostgaard, Mads Melbye, Thomas Alexander Gerds, Henrik Hjalgrim, Tune H Pers
Published in: Scientific Reports (2020)
Identification of individuals at risk of developing disease comorbidities represents an important task in tackling the growing personal and societal burdens associated with chronic diseases. We employed machine learning techniques to investigate to what extent data from longitudinal, nationwide Danish health registers can be used to predict individuals at high risk of developing type 2 diabetes (T2D) comorbidities. Leveraging logistic regression-, random forest- and gradient boosting models and register data spanning hospitalizations, drug prescriptions and contacts with primary care contractors from >200,000 individuals newly diagnosed with T2D, we predicted five-year risk of heart failure (HF), myocardial infarction (MI), stroke (ST), cardiovascular disease (CVD) and chronic kidney disease (CKD). For HF, MI, CVD, and CKD, register-based models outperformed a reference model leveraging canonical individual characteristics by achieving area under the receiver operating characteristic curve improvements of 0.06, 0.03, 0.04, and 0.07, respectively. The top 1,000 patients predicted to be at highest risk exhibited observed incidence ratios exceeding 4.99, 3.52, 1.97 and 4.71 respectively. In summary, prediction of T2D comorbidities utilizing Danish registers led to consistent albeit modest performance improvements over reference models, suggesting that register data could be leveraged to systematically identify individuals at risk of developing disease comorbidities.
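The two evaluation quantities named in the abstract, the area under the receiver operating characteristic curve (AUROC) and the incidence ratio among the highest-risk patients, can be illustrated with a minimal sketch. The function names, the toy labels and the toy scores below are invented for illustration and do not come from the study. The incidence ratio is assumed here to mean the observed incidence among the top-k ranked individuals divided by the incidence in the whole cohort; the paper may define it differently.

```python
from typing import Sequence


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random case outscores a random
    non-case, counting ties as one half. Quadratic time, fine for a sketch."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def top_k_incidence_ratio(y_true: Sequence[int], scores: Sequence[float], k: int) -> float:
    """Assumed definition: incidence among the k highest-scored individuals
    divided by the incidence in the full cohort."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the cohort size")
    overall = sum(y_true) / len(y_true)
    if overall == 0:
        raise ValueError("cohort has no observed cases")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_incidence = sum(y_true[i] for i in order[:k]) / k
    return top_incidence / overall


# Toy cohort (hypothetical): 1 = developed the comorbidity within follow-up.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
# Hypothetical risk scores from a register-based model and a reference model.
register_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05]
reference_scores = [0.6, 0.3, 0.2, 0.7, 0.5, 0.4, 0.1, 0.1, 0.05, 0.05]

auc_register = auroc(y_true, register_scores)
auc_reference = auroc(y_true, reference_scores)
auc_improvement = auc_register - auc_reference
ratio_top2 = top_k_incidence_ratio(y_true, register_scores, k=2)

print(f"AUROC improvement over reference: {auc_improvement:.3f}")
print(f"Top-2 incidence ratio: {ratio_top2:.2f}")
```

In this reading, an AUROC improvement of 0.06 means the register-based model ranks a random future case above a random non-case six percentage points more often than the reference model does, and an incidence ratio above 4 means the flagged group develops the comorbidity more than four times as often as the cohort as a whole.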
Keyphrases
- chronic kidney disease
- end stage renal disease
- type 2 diabetes
- newly diagnosed
- cardiovascular disease
- heart failure
- primary care
- machine learning
- electronic health record
- big data
- peritoneal dialysis
- atrial fibrillation
- healthcare
- cross sectional
- public health
- acute heart failure
- glycemic control
- ejection fraction
- climate change
- artificial intelligence
- left ventricular
- risk factors
- emergency department
- prognostic factors
- risk assessment
- social media
- adverse drug
- data analysis
- patient reported
- deep learning
- skeletal muscle
- cerebral ischemia
- drug induced