Development and multi-site external validation of a generalizable risk prediction model for bipolar disorder.
Colin G Walsh, Michael A Ripperger, Yirui Hu, Yi-Han Sheu, Hyunjoon Lee, Drew Wilimitis, Amanda B Zheutlin, Daniel Rocha, Karmel W Choi, Victor M Castro, H Lester Kirchner, Christopher F Chabris, Lea K Davis, Jordan W Smoller
Published in: Translational psychiatry (2024)
Bipolar disorder is a leading contributor to disability, premature mortality, and suicide. Early identification of risk for bipolar disorder using generalizable predictive models trained on diverse cohorts around the United States could improve targeted assessment of high-risk individuals, reduce misdiagnosis, and improve the allocation of limited mental health resources. This observational case-control study aimed to develop and validate generalizable predictive models of bipolar disorder as part of the multisite, multinational PsycheMERGE Network, using large and diverse biobanks with linked electronic health records (EHRs) from three academic medical centers: in the Northeast (Massachusetts General Brigham), the Mid-Atlantic (Geisinger), and the Mid-South (Vanderbilt University Medical Center). Predictive models were developed and validated with multiple algorithms at each study site: random forests, gradient boosting machines, penalized regression, and stacked ensemble learning algorithms combining them. Predictors were limited to widely available EHR-based features agnostic to a common data model, including demographics, diagnostic codes, and medications. The main study outcome was bipolar disorder diagnosis as defined by the International Cohort Collection for Bipolar Disorder, 2015. In total, the study included records for 3,529,569 patients, including 12,533 cases (0.3%) of bipolar disorder. After internal and external validation, algorithms performed best at their respective development sites. The stacked ensemble achieved the best combination of overall discrimination (AUC = 0.82-0.87) and calibration performance, with positive predictive values above 5% in the highest-risk quantiles at all three study sites. In conclusion, generalizable predictive models of risk for bipolar disorder can be feasibly developed across diverse sites to enable precision medicine.
Comparison of a range of machine learning methods indicated that an ensemble approach provided the best overall performance but required local retraining. These models will be disseminated via the PsycheMERGE Network website.
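The two headline metrics in the abstract, discrimination (AUC) and positive predictive value in the highest-risk quantile, can be illustrated with a minimal Python sketch. This is not the study's code: the function names `auc` and `ppv_top_quantile` and the ten-patient toy data are hypothetical, chosen only to show how such metrics are commonly computed from predicted risks and observed diagnoses.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen case receives a higher
    predicted risk than a randomly chosen control (ties count as half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def ppv_top_quantile(scores: Sequence[float], labels: Sequence[int],
                     fraction: float) -> float:
    """Positive predictive value among the `fraction` of patients with the
    highest predicted risk (for example, 0.01 for the top 1%)."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    n_top = max(1, int(len(scores) * fraction))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top = ranked[:n_top]
    return sum(y for _, y in top) / n_top


# Hypothetical toy data: 10 patients, 2 of whom are cases (label 1).
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
toy_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]

print(auc(toy_scores, toy_labels))                    # 0.9375
print(ppv_top_quantile(toy_scores, toy_labels, 0.2))  # 0.5
```

With a case prevalence as low as the one reported here, PPV in a top quantile is often more informative than AUC alone, since it reflects how many flagged patients would actually be cases.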
Keyphrases
- bipolar disorder
- major depressive disorder
- machine learning
- electronic health record
- mental health
- healthcare
- end stage renal disease
- type diabetes
- deep learning
- chronic kidney disease
- cardiovascular disease
- ejection fraction
- climate change
- artificial intelligence
- drug delivery
- coronary artery disease
- cardiovascular events
- body composition
- risk factors
- mental illness
- prognostic factors