A novel blood-based epigenetic biosignature in first-episode schizophrenia patients through automated machine learning.
Makrina Karaglani, Agorastos Agorastos, Maria Panagopoulou, Eleni Parlapani, Panagiotis Athanasis, Panagiotis Bitsios, Konstantina Tzitzikou, Theodosis Theodosiou, Ioannis Iliopoulos, Vasilios-Panteleimon Bozikas, Ekaterini Chatzaki
Published in: Translational Psychiatry (2024)
Schizophrenia (SCZ) is a chronic, severe, and complex psychiatric disorder that affects all aspects of personal functioning. While SCZ has a very strong biological component, there are still no objective diagnostic tests. Lately, special attention has been given to epigenetic biomarkers in SCZ. In this study, we introduce a three-step, data-driven biomarker discovery pipeline based on automated machine learning (AutoML), using genome-wide DNA methylation datasets and laboratory validation, to deliver a high-performing, blood-based epigenetic biosignature of diagnostic clinical value in SCZ. Publicly available blood methylomes from SCZ patients and healthy individuals were analyzed via AutoML to identify SCZ-specific biomarkers. The methylation of the identified genes was then analyzed by targeted qMSP assays in blood gDNA of 30 first-episode drug-naïve SCZ patients and 30 healthy controls (CTRL). Finally, AutoML was used to produce an optimized disease-specific biosignature based on patient methylation data combined with demographics. AutoML identified an SCZ-specific set of novel gene methylation biomarkers including IGF2BP1, CENPI, and PSME4. Functional analysis investigated correlations with SCZ pathology. Methylation levels of IGF2BP1 and PSME4, but not CENPI, were found to differ between groups, with IGF2BP1 higher and PSME4 lower in the SCZ group than in the CTRL group. Additional AutoML classification analysis of our experimental patient data led to a five-feature biosignature including all three genes, as well as age and sex, that discriminated SCZ patients from healthy individuals [AUC 0.755 (0.636, 0.862) and average precision 0.758 (0.690, 0.825)]. In conclusion, this three-step pipeline enabled the discovery of three novel genes and an epigenetic biosignature with potential value as blood-based SCZ diagnostics.
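The abstract reports the biosignature's performance as an AUC and an average precision, each with an interval in parentheses. As a rough illustration of how such figures can be obtained, the sketch below computes both metrics and a percentile bootstrap interval. It makes several assumptions that are not taken from the paper: the scores are synthetic random numbers (only the 30 vs. 30 group sizes mirror the study), the function names are hypothetical, and the percentile bootstrap is one common choice of interval, not necessarily the method the authors or their AutoML tool used.

```python
import random


def roc_auc(labels, scores):
    """AUC as the probability that a random positive outscores a random
    negative, counting ties as 0.5 (the Mann-Whitney formulation)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at each rank where a positive appears.
    Tied scores are ranked in input order (no special tie handling)."""
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    hits, total = 0, 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    return total / hits


def bootstrap_ci(metric, labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval: resample subjects with replacement,
    recompute the metric, and take the alpha/2 and 1 - alpha/2 quantiles."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:  # skip resamples that lack one of the classes
            continue
        stats.append(metric(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Synthetic stand-in data: 30 "SCZ" (label 1) and 30 "CTRL" (label 0) subjects.
# The scores are random draws, NOT values from the study.
rng = random.Random(42)
labels = [1] * 30 + [0] * 30
scores = ([rng.gauss(0.6, 0.2) for _ in range(30)]
          + [rng.gauss(0.4, 0.2) for _ in range(30)])

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
auc_lo, auc_hi = bootstrap_ci(roc_auc, labels, scores)
ap_lo, ap_hi = bootstrap_ci(average_precision, labels, scores)

print(f"AUC {auc:.3f} ({auc_lo:.3f}, {auc_hi:.3f})")
print(f"Average precision {ap:.3f} ({ap_lo:.3f}, {ap_hi:.3f})")
```

With only 60 subjects the resampled intervals tend to be wide, which is consistent with the spread of the intervals quoted in the abstract.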
Keyphrases
- dna methylation
- genome wide
- machine learning
- end stage renal disease
- newly diagnosed
- ejection fraction
- gene expression
- deep learning
- high throughput
- peritoneal dialysis
- bipolar disorder
- small molecule
- mental health
- mass spectrometry
- big data
- working memory
- emergency department
- case report
- copy number
- high speed
- drug delivery
- genome wide identification
- adverse drug
- neural network