Identify gestational diabetes mellitus by deep learning model from cell-free DNA at the early gestation stage.
Yipeng Wang, Pei Sun, Zicheng Zhao, Yousheng Yan, Wentao Yue, Kai Yang, Ruixia Liu, Hui Huang, Yinan Wang, Yin Chen, Nan Li, Hailong Feng, Jing Li, Yifan Liu, Yujiao Chen, Bai-Rong Shen, Lijian Zhao, Chenghong Yin
Published in: Briefings in Bioinformatics (2024)
Gestational diabetes mellitus (GDM) is a common complication of pregnancy that has significant adverse effects on both the mother and the fetus. The incidence of GDM is increasing globally, and early diagnosis is critical for timely treatment and for reducing the risk of poor pregnancy outcomes. GDM is usually diagnosed after 24 weeks of gestation, whereas complications due to GDM can occur much earlier. Copy number variations (CNVs) are a possible biomarker for GDM diagnosis and screening in the early gestation stage. In this study, we propose a machine-learning method to screen for GDM in the early stage of gestation using cell-free DNA (cfDNA) sequencing data from maternal plasma. A total of 5085 patients from northern regions of Mainland China, including 1942 with GDM, were recruited. A non-overlapping sliding window method was applied for CNV coverage screening on low-coverage (~0.2×) sequencing data. The CNV coverage was fed to a convolutional neural network with an attention architecture for binary classification. The model achieved a classification accuracy of 88.14%, precision of 84.07%, recall of 93.04%, F1-score of 88.33% and AUC of 96.49%. The model identified 2190 genes associated with GDM, including DEFA1, DEFA3 and DEFB1. The enriched gene ontology (GO) terms and KEGG pathways showed that many of the identified genes are associated with diabetes-related pathways. Our study demonstrates the feasibility of using cfDNA sequencing data and machine-learning methods for early diagnosis of GDM, which may aid early intervention and the prevention of adverse pregnancy outcomes.
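The non-overlapping sliding window step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the window size, the normalization, or the input format, so everything below is an assumption made for illustration: the function names `window_coverage` and `normalize_by_mean` are hypothetical, reads are represented only by 0-based start positions on a single chromosome, and coverage is scaled by the mean window count. It is not the authors' pipeline.

```python
def window_coverage(read_starts, chrom_length, window_size):
    """Count reads in consecutive, non-overlapping windows.

    read_starts  : iterable of 0-based read start positions (assumed input format)
    chrom_length : chromosome length in bases
    window_size  : window width in bases (hypothetical; not stated in the abstract)
    """
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    # Ceiling division so a shorter final window is still counted.
    n_windows = (chrom_length + window_size - 1) // window_size
    counts = [0] * n_windows
    for pos in read_starts:
        # Ignore positions that fall outside the chromosome.
        if 0 <= pos < chrom_length:
            counts[pos // window_size] += 1
    return counts


def normalize_by_mean(counts):
    """Scale window counts by their mean so samples of different depth are comparable.

    This normalization is an illustrative choice, not one described in the abstract.
    """
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    mean = total / len(counts)
    return [c / mean for c in counts]


if __name__ == "__main__":
    # Toy example: five reads on a 30-base "chromosome" with 10-base windows.
    counts = window_coverage([0, 5, 10, 19, 25], chrom_length=30, window_size=10)
    print(counts)
    print(normalize_by_mean(counts))
```

In a setting like the one described, a per-window coverage vector of this kind would be the input to the classifier; windows with values well above or below 1 after normalization are candidate regions of copy number gain or loss.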
Keyphrases
- pregnancy outcomes
- machine learning
- deep learning
- pregnant women
- copy number
- convolutional neural network
- big data
- preterm infants
- early stage
- gestational age
- artificial intelligence
- genome wide
- electronic health record
- type diabetes
- mitochondrial dna
- single cell
- randomized controlled trial
- cardiovascular disease
- emergency department
- dna methylation
- preterm birth
- newly diagnosed
- squamous cell carcinoma
- birth weight
- gene expression
- adipose tissue
- chronic kidney disease
- prognostic factors
- skeletal muscle
- patient reported outcomes
- high throughput
- drug induced
- combination therapy
- ionic liquid
- sentinel lymph node