A Prediction Model of Autism Spectrum Diagnosis from Well-Baby Electronic Data Using Machine Learning.
Ayelet Ben-Sasson, Joshua Guedalia, Liat Nativ, Keren Ilan, Meirav Shaham, Lidia V Gabis
Published in: Children (Basel, Switzerland) (2024)
Early detection of autism spectrum disorder (ASD) is crucial for timely intervention, yet diagnosis typically occurs after age three. This study aimed to develop a machine learning model to predict ASD diagnosis using infants' electronic health records obtained through a national screening program and evaluate its accuracy. A retrospective cohort study analyzed health records of 780,610 children, including 1163 with ASD diagnoses. Data encompassed birth parameters, growth metrics, developmental milestones, and familial and post-natal variables from routine wellness visits within the first two years. Using a gradient boosting model with 3-fold cross-validation, 100 parameters predicted ASD diagnosis with an average area under the ROC curve of 0.86 (SD < 0.002). Feature importance was quantified using SHapley Additive exPlanations (SHAP). The model identified a high-risk group with a 4.3-fold higher ASD incidence (0.006) compared to the cohort (0.001). Key predictors included failing six milestones in language, social, and fine motor domains during the second year, male gender, parental developmental concerns, non-nursing, older maternal age, lower gestational age, and atypical growth percentiles. Machine learning algorithms capitalizing on preventative care electronic health records can facilitate ASD screening by accounting for complex relations between familial and birth factors, post-natal growth, developmental parameters, and parent concern.
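The figures in the abstract can be read more easily with a little arithmetic. The following is a minimal sketch, not the authors' code: it recomputes the cohort incidence from the reported counts, shows which high-risk incidence a 4.3-fold enrichment would imply before rounding, and illustrates what an area under the ROC curve measures on a small invented example. The function names and the toy labels and scores are hypothetical and chosen only for illustration; only the counts 780,610 and 1163 and the factor 4.3 come from the abstract.

```python
def incidence(cases: int, population: int) -> float:
    """Fraction of the population with the diagnosis."""
    return cases / population


def roc_auc(labels, scores) -> float:
    """Area under the ROC curve, computed as the probability that a randomly
    chosen positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Counts reported in the abstract.
COHORT_SIZE = 780_610
ASD_CASES = 1_163

cohort_incidence = incidence(ASD_CASES, COHORT_SIZE)  # about 0.0015

# A 4.3-fold enrichment over the unrounded cohort incidence implies a
# high-risk incidence of roughly 0.0064, which rounds to the reported 0.006.
# This is a consistency check on the rounded figures, not a reported value.
implied_high_risk_incidence = 4.3 * cohort_incidence

# Invented toy data (NOT from the study): 1 = diagnosed, 0 = not diagnosed.
toy_labels = [0, 0, 1, 0, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90]
toy_auc = roc_auc(toy_labels, toy_scores)

if __name__ == "__main__":
    print(f"cohort incidence:            {cohort_incidence:.5f}")
    print(f"implied high-risk incidence: {implied_high_risk_incidence:.5f}")
    print(f"toy AUC:                     {toy_auc:.3f}")
```

Read this way, an AUC of 0.86 means that in roughly 86% of pairs made up of one diagnosed and one undiagnosed child, the model assigns the higher risk score to the diagnosed child. The pairwise computation above is quadratic in the number of cases and is meant for illustration only.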
Keyphrases
- autism spectrum disorder
- electronic health record
- machine learning
- gestational age
- intellectual disability
- attention deficit hyperactivity disorder
- healthcare
- birth weight
- clinical decision support
- quality improvement
- mental health
- big data
- south africa
- randomized controlled trial
- artificial intelligence
- early onset
- adverse drug
- risk factors
- working memory
- pregnant women
- body mass index
- data analysis