Automated optimization of XCMS parameters for improved peak picking of liquid chromatography-mass spectrometry data using the coefficient of variation and parameter sweeping for untargeted metabolomics.
Sascha K Manier, Andreas Keller, Markus R Meyer
Published in: Drug Testing and Analysis (2018)
Accurate peak picking and subsequent processing remain a challenge in untargeted metabolomics based on liquid chromatography-mass spectrometry (LC-MS) data. Optimizing these steps is crucial for obtaining reliable results. This study investigated and optimized peak detection by XCMS, a widely used R package for peak picking and processing of high-resolution LC-MS metabolomics data, using the coefficient of variation as the criterion and neat standard solutions of drug-like compounds as test samples. The results were additionally verified using fortified pooled plasma samples. Mass spectrometer settings were optimized according to recommendations in the literature to enable reliable detection of the investigated analytes. XCMS parameters were evaluated using a comprehensive parameter sweeping approach. The optimization steps were evaluated statistically and visualized by principal component analysis (PCA). For the lower-concentration solution in methanol, optimizing both mass spectrometer and XCMS parameters improved the median coefficient of variation from 24% to 7%, reduced retention time fluctuation from 9.3 seconds to 0.54 seconds, and reduced fluctuation of the mass-to-charge ratio (m/z) from m/z 0.00095 to m/z 0.00028. The number of parent compounds and related species annotated by CAMERA increased from 88 to 113, while the total number of features decreased from 3282 to 428. Optimized MS settings such as increased resolution led to a higher specificity of peak picking. PCA supported these findings by showing the best clustering of samples after optimization of both mass spectrometer and XCMS parameters. The results imply that peak picking needs to be adapted individually to the experimental setup. Reducing unwanted variation in the data set was most successful when high resolving power was combined with strict peak picking settings.
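The general idea of scoring a parameter sweep by the coefficient of variation can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call XCMS. The callable `pick_peaks`, the feature-table layout (feature ID mapped to peak areas across replicate injections), and the parameter names used in the grid are assumptions made purely for illustration. The study also considered further criteria, such as retention time and m/z fluctuation and the number of annotated species, which this sketch leaves out.

```python
import itertools
import statistics


def coefficient_of_variation(values):
    """Return the CV in percent: 100 * sample standard deviation / mean."""
    mean = statistics.mean(values)
    return 100.0 * statistics.stdev(values) / mean


def median_cv(feature_table):
    """Median CV over all features.

    feature_table maps a feature ID to the peak areas measured for that
    feature across replicate injections (assumed layout).
    """
    return statistics.median(
        coefficient_of_variation(areas) for areas in feature_table.values()
    )


def sweep(param_grid, pick_peaks):
    """Try every parameter combination; return (best_params, best_median_cv).

    pick_peaks is a hypothetical stand-in for one peak-picking run: it takes
    a dict of parameters and returns a feature table as described above.
    """
    names = sorted(param_grid)
    best = None
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = median_cv(pick_peaks(params))
        # Keep the combination with the lowest median CV seen so far.
        if best is None or score < best[1]:
            best = (params, score)
    return best
```

In practice, `pick_peaks` would wrap an actual peak-picking run on replicate injections, and a lower median CV would be only one of several criteria for choosing the final settings.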
Keyphrases
- mass spectrometry
- high resolution
- liquid chromatography
- high resolution mass spectrometry