Predicting childhood and adolescent attention-deficit/hyperactivity disorder onset: a nationwide deep learning approach.
Miguel Garcia-Argibay, Yanli Zhang-James, Samuele Cortese, Paul Lichtenstein, Henrik Larsson, Stephen V Faraone
Published in: Molecular Psychiatry (2022)
Attention-deficit/hyperactivity disorder (ADHD) is a heterogeneous disorder with a high degree of psychiatric and physical comorbidity, which complicates its diagnosis in childhood and adolescence. We analyzed registry data from 238,696 persons born and living in Sweden between 1995 and 1999. Several machine learning techniques were used to assess the ability of registry data to inform the diagnosis of ADHD in childhood and adolescence: logistic regression, random forest, gradient boosting, XGBoost, penalized logistic regression, deep neural network (DNN), and ensemble models. The best-fitting model was the DNN, which achieved an area under the receiver operating characteristic curve of 0.75 (95% CI 0.74-0.76) and a balanced accuracy of 0.69. At the 0.45 probability threshold, sensitivity was 71.66% and specificity was 65.0%. Feature importance rankings agreed overall across all models (τ > 0.5). The top 5 features contributing to classification were having a parent with criminal convictions, male sex, having a relative with ADHD, number of academic subjects failed, and speech/learning disabilities. A DNN model predicting childhood and adolescent ADHD, trained exclusively on Swedish register data, achieved good discrimination. If replicated and validated in an external sample, and proven to be cost-effective, this model could be used to alert clinicians to individuals who ought to be screened for ADHD and to aid clinicians' decision-making, with the goal of decreasing misdiagnoses. Further research is needed to validate the results in different populations and to incorporate new predictors.
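To make the threshold-based metrics in the abstract concrete, the following is a minimal sketch of how sensitivity, specificity, and balanced accuracy can be computed from predicted probabilities at a chosen cut-off. It is not the authors' code: the function name `threshold_metrics`, the toy labels and probabilities, and the choice to classify a case as positive when its probability is greater than or equal to the threshold are all assumptions made for illustration. Only the 0.45 threshold is taken from the abstract.

```python
from typing import Dict, Sequence


def threshold_metrics(
    y_true: Sequence[int],
    y_prob: Sequence[float],
    threshold: float = 0.45,  # cut-off reported in the abstract
) -> Dict[str, float]:
    """Compute sensitivity, specificity and balanced accuracy at a threshold.

    Hypothetical helper: a case is predicted positive when its probability
    is >= threshold (an assumption, not stated in the abstract).
    """
    tp = fn = tn = fp = 0
    for label, prob in zip(y_true, y_prob):
        predicted = 1 if prob >= threshold else 0
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 1 and predicted == 0:
            fn += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fp += 1

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("Both classes must be present in y_true.")

    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Balanced accuracy is the mean of sensitivity and specificity.
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Toy data for illustration only; these are not values from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_prob = [0.90, 0.60, 0.50, 0.30, 0.70, 0.40, 0.35, 0.20, 0.10, 0.05]

metrics = threshold_metrics(y_true, y_prob)
print(metrics)
```

Because balanced accuracy weights the two classes equally, it is less affected by class imbalance than plain accuracy, which matters when the diagnosed group is a small fraction of the cohort.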
Keyphrases
- attention deficit hyperactivity disorder
- machine learning
- deep learning
- neural network
- autism spectrum disorder
- childhood cancer
- mental health
- big data
- working memory
- young adults
- electronic health record
- early life
- decision making
- artificial intelligence
- depressive symptoms
- palliative care
- convolutional neural network
- cross sectional
- data analysis
- hearing loss