Linking gene expression to clinical outcomes in pediatric Crohn's disease using machine learning.
Kevin A Chen, Nina C Nishiyama, Meaghan M Kennedy Ng, Alexandria Shumway, Chinmaya U Joisa, Matthew R Schaner, Grace Lian, Caroline Beasley, Lee-Ching Zhu, Surekha Bantumilli, Muneera R Kapadia, Shawn M Gomez, Terrence S Furey, Shehzad Z Sheikh
Published in: Scientific Reports (2024)
Pediatric Crohn's disease (CD) is characterized by a severe disease course with frequent complications. We sought to apply machine learning-based models to predict the risk of developing future complications in pediatric CD using ileal and colonic gene expression. Gene expression data were generated from 101 formalin-fixed, paraffin-embedded (FFPE) ileal and colonic biopsies obtained from treatment-naïve CD patients and controls. Clinical outcomes, including development of strictures or fistulas and progression to surgery, were analyzed using differential expression and modeled using machine learning. Differential expression analysis revealed downregulation of pathways related to inflammation and extracellular matrix production in patients with strictures. Machine learning-based models were able to incorporate colonic gene expression and clinical characteristics to predict outcomes with high accuracy. Models showed an area under the receiver operating characteristic curve (AUROC) of 0.84 for strictures, 0.83 for remission, and 0.75 for surgery. Genes with potential prognostic importance for strictures (REG1A, MMP3, and DUOX2) were not identified in single-gene differential analysis but were found to contribute strongly to predictive models. Our findings in FFPE tissue support the importance of colonic gene expression and the potential for machine learning-based models in predicting outcomes for pediatric CD.
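For readers unfamiliar with the AUROC values reported above, the metric can be read as the probability that a randomly chosen patient who developed the outcome receives a higher predicted risk than a randomly chosen patient who did not. The sketch below illustrates that definition only. It is not the authors' pipeline; the function name `auroc` and the toy labels and scores are hypothetical and do not come from the study.

```python
def auroc(labels, scores):
    """Pairwise (Mann-Whitney) estimate of the AUROC.

    labels: 1 for patients with the outcome, 0 for patients without it.
    scores: predicted risk for each patient (higher means more likely).
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive patient ranked above negative patient
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration (not from the paper).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.7, 0.6, 0.2, 0.4, 0.5]
print(round(auroc(toy_labels, toy_scores), 3))  # 7 of 9 pairs ordered correctly -> 0.778
```

Under this reading, an AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.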
Keyphrases
- gene expression
- machine learning
- dna methylation
- ulcerative colitis
- minimally invasive
- big data
- end stage renal disease
- artificial intelligence
- chronic kidney disease
- oxidative stress
- type diabetes
- genome wide identification
- coronary artery disease
- risk factors
- adipose tissue
- electronic health record
- human health
- drug induced
- early onset
- rheumatoid arthritis
- transcription factor
- risk assessment
- systemic lupus erythematosus
- metabolic syndrome
- young adults
- skeletal muscle
- percutaneous coronary intervention
- cell migration
- glycemic control
- data analysis