Fast and automated biomarker detection in breath samples with machine learning.
Angelika Skarysz, Dahlia Salman, Michael Eddleston, Martin Sykora, Eugénie Hunsicker, William H Nailon, Kareen Darnley, Duncan B McLaren, C L Paul Thomas, Andrea Soltoggio
Published in: PLoS ONE (2022)
Volatile organic compounds (VOCs) in human breath can reveal a wide spectrum of health conditions and can be used for fast, accurate and non-invasive diagnostics. Gas chromatography-mass spectrometry (GC-MS) is used to measure VOCs, but its application is limited by expert-driven data analysis that is time-consuming, subjective and prone to error. We propose a machine learning-based system for GC-MS data analysis that exploits the pattern-recognition ability of deep learning to learn and automatically detect VOCs directly from raw data, thus bypassing expert-led processing. We evaluate this approach on clinical samples with four types of convolutional neural networks (CNNs): VGG16, VGG-like, densely connected and residual CNNs. The proposed machine learning methods outperformed the expert-led analysis, detecting a significantly higher number of VOCs in a fraction of the time while maintaining high specificity. These results suggest that the proposed approach can support the large-scale deployment of breath-based diagnosis by reducing time and cost and by increasing accuracy and consistency.
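The abstract reports high specificity but does not say how it was computed. As a minimal sketch, assuming the standard definitions (specificity = TN / (TN + FP), sensitivity = TP / (TP + FN)) and one binary detection decision per candidate VOC, the comparison between a detector's output and reference labels could look like the following. The function name `detection_metrics` and the sample labels are hypothetical illustrations, not data or code from the paper.

```python
import math


def detection_metrics(predicted, reference):
    """Compare binary VOC detections against reference labels.

    Both arguments are equal-length sequences of booleans, one entry per
    candidate compound (True = compound reported as present).
    Standard textbook definitions are assumed; the paper's exact
    evaluation protocol is not given in the abstract.
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # true positives
    tn = sum(1 for p, r in pairs if not p and not r)  # true negatives
    fp = sum(1 for p, r in pairs if p and not r)      # false positives
    fn = sum(1 for p, r in pairs if not p and r)      # false negatives

    # Return NaN when a ratio is undefined (no positives or no negatives).
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    return {"tp": tp, "tn": tn, "fp": fp, "fn": fn,
            "sensitivity": sensitivity, "specificity": specificity}


# Hypothetical labels for five candidate compounds (illustration only).
predicted = [True, True, False, False, True]
reference = [True, False, False, False, True]
metrics = detection_metrics(predicted, reference)
print(metrics)
```

A detector that reports more compounds tends to raise sensitivity at the risk of more false positives, which is why the two figures are usually read together.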
Keyphrases
- data analysis
- machine learning
- deep learning
- convolutional neural network
- gas chromatography mass spectrometry
- artificial intelligence
- big data
- clinical practice
- endothelial cells
- healthcare
- public health
- solid phase extraction
- gas chromatography
- adverse drug
- gene expression
- label free
- high resolution
- high throughput
- emergency department
- depressive symptoms
- single cell
- quantum dots
- physical activity