An end-to-end deep learning method for mass spectrometry data analysis to reveal disease-specific metabolic profiles.
Yongjie Deng, Yao Yao, Yanni Wang, Tiantian Yu, Wenhao Cai, Dingli Zhou, Feng Yin, Wan-Li Liu, Yuying Liu, Chuanbo Xie, Jian Guan, Yumin Hu, Peng Huang, Weizhong Li
Published in: Nature Communications (2024)
Untargeted metabolomic analysis using mass spectrometry provides comprehensive metabolic profiling, but its medical application is hindered by complex data processing, high inter-batch variability, and unidentified metabolites. Here, we present DeepMSProfiler, an explainable deep-learning-based method that enables end-to-end analysis of raw metabolic signals with highly accurate and reliable output. Using 859 human serum samples collected across hospitals from patients with lung adenocarcinoma, patients with benign lung nodules, and healthy individuals, DeepMSProfiler successfully differentiates the metabolomic profiles of the groups (AUC 0.99) and detects early-stage lung adenocarcinoma (accuracy 0.961). Model flow and ablation experiments demonstrate that DeepMSProfiler overcomes inter-hospital variability and the effects of unknown metabolite signals. Our ensemble strategy removes background-category phenomena in multi-classification deep-learning models, and the novel interpretability enables direct access to disease-related metabolite-protein networks. Further application to lipid metabolomic data unveils correlations between important metabolites and proteins. Overall, DeepMSProfiler offers a straightforward and reliable method for disease diagnosis and mechanism discovery, which supports its broad applicability.
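The abstract does not say how the ensemble combines its member models. As a minimal sketch of the general idea only, the Python below assumes plain averaging of per-class probabilities across several independently trained classifiers. The function names, the class order, and the example probabilities are all hypothetical and are not taken from DeepMSProfiler.

```python
from typing import List, Sequence, Tuple

# The three groups named in the abstract; the order here is an assumption.
CLASSES = ["healthy", "benign lung nodule", "lung adenocarcinoma"]


def ensemble_average(prob_sets: Sequence[Sequence[float]]) -> List[float]:
    """Average per-class probabilities from several models for one sample."""
    if not prob_sets:
        raise ValueError("need at least one model output")
    n_classes = len(prob_sets[0])
    if any(len(p) != n_classes for p in prob_sets):
        raise ValueError("all models must output the same number of classes")
    n_models = len(prob_sets)
    return [sum(p[c] for p in prob_sets) / n_models for c in range(n_classes)]


def predict(prob_sets: Sequence[Sequence[float]]) -> Tuple[str, List[float]]:
    """Return the class with the highest averaged probability, plus the averages."""
    avg = ensemble_average(prob_sets)
    best = max(range(len(avg)), key=avg.__getitem__)
    return CLASSES[best], avg


# Hypothetical softmax outputs of three models for a single serum sample.
model_outputs = [
    [0.2, 0.5, 0.3],
    [0.1, 0.3, 0.6],
    [0.2, 0.2, 0.6],
]
label, averaged = predict(model_outputs)
print(label, [round(p, 3) for p in averaged])
```

In this made-up example one model favors the benign class, but the averaged probabilities favor lung adenocarcinoma, which illustrates how averaging can dampen the idiosyncratic bias of a single model.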
Keyphrases
- deep learning
- mass spectrometry
- data analysis
- convolutional neural network
- early stage
- artificial intelligence
- ms/ms
- healthcare
- liquid chromatography
- machine learning
- big data
- electronic health record
- high performance liquid chromatography
- capillary electrophoresis
- gene expression
- emergency department
- high throughput
- squamous cell carcinoma
- adverse drug
- amino acid
- dna methylation
- lymph node
- drug induced
- protein protein
- atrial fibrillation