Comparison of data processing strategies using commercial vs. open-source software in GC-Orbitrap-HRMS untargeted metabolomics analysis for food authentication: thyme geographical differentiation and marker identification as a case study.
Araceli Rivera-Pérez, Antonia Garrido Frenich. Published in: Analytical and Bioanalytical Chemistry (2024)
Untargeted analysis of gas chromatography-high-resolution mass spectrometry (GC-HRMS) data is a key, time-consuming step in identifying metabolite markers for food authentication. Few studies have evaluated how well untargeted data processing tools perform feature extraction, metabolite annotation, and marker selection on GC-HRMS data, because most evaluations focus on liquid chromatography (LC) analysis. This study provides a comprehensive evaluation of data analysis tools for GC-Orbitrap-HRMS plant metabolomics data, comparing the open-source MS-DIAL software with the commercial Compound Discoverer™ software (designed for Orbitrap data processing). The case study is the geographical discrimination of thyme (Spanish vs. Polish) and the search for its markers. With both approaches, feature detection was strongly affected by unknown metabolites (Levels 4-5 of identification confidence), background signals, and duplicate features, all of which must be carefully assessed before multivariate data analysis if markers are to be putatively identified reliably. Compound Discoverer™ and MS-DIAL putatively annotated 52 and 115 compounds at Level 2, respectively. Subsequent multivariate data analysis identified differential compounds and showed that the putative identification of markers, especially in challenging untargeted analysis, depends heavily on the data processing parameters, including the databases available during compound annotation. Overall, the comparison showed both approaches to be good options for untargeted analysis of GC-Orbitrap-HRMS data, and it is presented as a guide for users implementing these data processing approaches in food authenticity applications, depending on which software is available to them.
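The abstract notes that background signals and duplicate features must be assessed before multivariate analysis. A minimal Python sketch of such a clean-up step is given below for illustration only. It is not the procedure used in the study or by either software package. The `Feature` record, the blank-ratio threshold of 3, and the retention-time and ppm tolerances are all assumptions chosen for the example, and the peak areas are invented.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Feature:
    rt_min: float        # retention time in minutes
    mz: float            # m/z of the quantification ion
    sample_areas: list   # peak areas in real samples (assumed non-empty)
    blank_areas: list    # peak areas in procedural blanks (may be empty)


def is_background(f, min_ratio=3.0):
    """True if the mean sample area is below min_ratio times the mean blank area."""
    blank = mean(f.blank_areas) if f.blank_areas else 0.0
    return mean(f.sample_areas) < min_ratio * blank


def drop_duplicates(features, rt_tol=0.05, ppm_tol=5.0):
    """Among features within both tolerances, keep only the most intense one."""
    kept = []
    # Visit features from most to least intense so the strongest copy wins.
    for f in sorted(features, key=lambda x: mean(x.sample_areas), reverse=True):
        clash = any(
            abs(f.rt_min - k.rt_min) <= rt_tol
            and abs(f.mz - k.mz) / k.mz * 1e6 <= ppm_tol
            for k in kept
        )
        if not clash:
            kept.append(f)
    return kept


def clean(features, min_ratio=3.0, rt_tol=0.05, ppm_tol=5.0):
    """Remove background features, then collapse duplicates."""
    signal = [f for f in features if not is_background(f, min_ratio)]
    return drop_duplicates(signal, rt_tol, ppm_tol)


# Invented example values, not data from the study.
example = [
    Feature(10.00, 135.0804, [1000, 1200], [10, 20]),
    Feature(10.02, 135.0805, [400, 500], [5, 5]),       # near-duplicate of the first
    Feature(15.50, 207.0324, [300, 310], [290, 305]),   # comparable to the blank
    Feature(12.30, 150.1039, [800, 900], []),
]
cleaned = clean(example)
```

In practice the thresholds would need tuning per dataset, and the surviving features would still require annotation and manual review before any marker claims are made.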
Keyphrases
- high resolution mass spectrometry
- data analysis
- liquid chromatography
- gas chromatography
- mass spectrometry
- ultra high performance liquid chromatography
- tandem mass spectrometry
- simultaneous determination
- big data
- gas chromatography mass spectrometry
- machine learning
- solid phase extraction
- risk assessment
- ms ms
- bioinformatics analysis