Machine Learning Applications for Mass Spectrometry-Based Metabolomics.
Ulf W Liebal, An N T Phan, Malvika Sudhakar, Karthik Raman, Lars M Blank
Published in: Metabolites (2020)
The metabolome of an organism depends on environmental factors and intracellular regulation and provides information about its physiological condition. Metabolomics helps to understand disease progression in clinical settings or to estimate metabolite overproduction for metabolic engineering. The most popular analytical metabolomics platform is mass spectrometry (MS). However, MS metabolome data analysis is complicated, since metabolites interact nonlinearly, and the data structures themselves are complex. Machine learning methods have become immensely popular for statistical analysis because of their inherently nonlinear data representation and their ability to process large and heterogeneous data rapidly. In this review, we address recent developments in using machine learning for processing MS spectra and show how machine learning generates new biological insights. In particular, supervised machine learning has great potential in metabolomics research because of its ability to supply quantitative predictions. We review here commonly used tools, such as random forest, support vector machines, artificial neural networks, and genetic algorithms. During processing steps, supervised machine learning methods help with peak picking, normalization, and missing data imputation. For knowledge-driven analysis, machine learning contributes to biomarker detection, classification and regression, biochemical pathway identification, and carbon flux determination. Of particular relevance is the combination of different omics data to identify the contributions of the various regulatory levels. Our overview of recent publications also highlights that data quality determines analysis quality and adds to the challenge of choosing the right model for the data. Machine learning methods applied to MS-based metabolomics ease data analysis and can support clinical decisions, guide metabolic engineering, and stimulate fundamental biological discoveries.
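The abstract lists missing data imputation among the processing steps. As an illustration only, the following is a minimal sketch of k-nearest-neighbour imputation, one commonly used approach and not necessarily a method covered by the review. The function name `knn_impute`, the parameter `k`, and the small intensity table are hypothetical and chosen purely for the example.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries with the mean of that feature over the k nearest samples.

    Distance between two samples is the root mean squared difference over the
    features observed in both samples. This is an illustrative sketch, not a
    reference implementation.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf  # no shared features: cannot compare these samples
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    imputed = [row[:] for row in matrix]  # copy so the input stays unchanged
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other samples in which feature j was measured.
            donors = sorted(
                (distance(row, other), other[j])
                for n, other in enumerate(matrix)
                if n != i and other[j] is not None
            )
            nearest = [v for d, v in donors[:k] if d < math.inf]
            if nearest:  # leave the gap if no comparable donor exists
                imputed[i][j] = sum(nearest) / len(nearest)
    return imputed


# Hypothetical peak-intensity table: rows are samples, columns are features.
intensities = [
    [10.0, 200.0, 30.0],
    [11.0, None, 31.0],
    [50.0, 900.0, 5.0],
    [12.0, 210.0, 29.0],
]
filled = knn_impute(intensities, k=2)
```

In this toy table the missing value in the second sample is filled from the two most similar samples (the first and fourth), while the dissimilar third sample is ignored. Real MS data would usually be log-transformed and scaled before distances are computed, which this sketch omits.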
Keyphrases
- machine learning
- mass spectrometry
- big data
- data analysis
- artificial intelligence
- liquid chromatography
- electronic health record
- deep learning
- high resolution
- high performance liquid chromatography
- ms ms
- capillary electrophoresis
- gas chromatography
- multiple sclerosis
- healthcare
- neural network
- high throughput
- climate change
- reactive oxygen species
- molecularly imprinted
- molecular dynamics
- quantum dots
- solid phase extraction
- density functional theory