metabCombiner 2.0: Disparate Multi-Dataset Feature Alignment for LC-MS Metabolomics.
Hani Habra, Jennifer L Meijer, Tong Shen, Oliver Fiehn, David A Gaul, Facundo M Fernández, Kaitlin R Rempfert, Thomas O Metz, Karen E Peterson, Charles R Evans, Alla Karnovsky
Published in: Metabolites (2024)
Liquid chromatography-high-resolution mass spectrometry (LC-HRMS), as applied to untargeted metabolomics, enables the simultaneous detection of thousands of small molecules, generating complex datasets. Alignment is a crucial step in data processing pipelines, whereby LC-MS features derived from common ions are assembled into a unified matrix amenable to further analysis. Variability in the analytical factors that influence liquid chromatography separations complicates data alignment. The problem is most prominent when aligning data acquired in different laboratories, data generated using non-identical instruments, or batches from large-scale studies. Previously, we developed metabCombiner for aligning disparately acquired LC-MS metabolomics datasets. Here, we report significant upgrades to metabCombiner that enable the stepwise alignment of multiple untargeted LC-MS metabolomics datasets, facilitating inter-laboratory reproducibility studies. To accomplish this, a "primary" feature list is used as a template for matching compounds in "target" feature lists. We demonstrate this workflow by aligning four lipidomics datasets from core laboratories, each generated using the institution's in-house LC-MS instrumentation and methods. We also introduce batchCombine, an application of the metabCombiner framework for aligning experiments composed of multiple batches. metabCombiner is available as an R package on GitHub and Bioconductor, along with a new online version implemented as an R Shiny App.
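The primary/target idea in the abstract can be illustrated with a toy sketch. The Python below is not metabCombiner's API or algorithm (the package is written in R, and the abstract does not state its matching rule). It assumes a deliberately simple rule: a target feature matches a primary feature when both the m/z difference and the retention-time difference fall within fixed tolerances. The names `Feature`, `match_to_primary`, `stepwise_align`, `mz_tol`, and `rt_tol` are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """One LC-MS feature: an ID, an m/z value, and a retention time in minutes."""
    fid: str
    mz: float
    rt: float


def match_to_primary(primary, target, mz_tol=0.005, rt_tol=0.5):
    """Map each primary feature ID to the closest unused target feature ID.

    Assumption: fixed m/z and RT tolerances stand in for a real matching model.
    """
    matches = {}
    used = set()
    for p in primary:
        best, best_score = None, None
        for t in target:
            if t.fid in used:
                continue
            dmz = abs(p.mz - t.mz)
            drt = abs(p.rt - t.rt)
            if dmz <= mz_tol and drt <= rt_tol:
                # Lower score means closer in both dimensions.
                score = dmz / mz_tol + drt / rt_tol
                if best_score is None or score < best_score:
                    best, best_score = t, score
        if best is not None:
            used.add(best.fid)
            matches[p.fid] = best.fid
    return matches


def stepwise_align(primary, targets, **tolerances):
    """Align each named target list to the same primary template, one at a time."""
    table = {p.fid: {} for p in primary}
    for name, target in targets.items():
        matched = match_to_primary(primary, target, **tolerances)
        for primary_id, target_id in matched.items():
            table[primary_id][name] = target_id
    return table


# Hypothetical feature lists, invented for this example.
primary = [Feature("P1", 760.585, 10.0), Feature("P2", 496.340, 3.0)]
targets = {
    "lab_a": [Feature("A1", 760.586, 10.2), Feature("A2", 300.000, 5.0)],
    "lab_b": [Feature("B1", 496.341, 3.3), Feature("B2", 760.584, 9.8)],
}
print(stepwise_align(primary, targets))
```

Keeping one primary list as the template means every target is matched against the same reference, so the result is a single table keyed by primary feature. A fixed retention-time tolerance is the weakest part of this sketch, since retention times can shift substantially between laboratories and instruments, which is the difficulty the abstract highlights.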
Keyphrases
- liquid chromatography
- mass spectrometry
- high resolution mass spectrometry
- tandem mass spectrometry
- ultra high performance liquid chromatography
- electronic health record
- capillary electrophoresis
- simultaneous determination
- gas chromatography
- machine learning
- high resolution
- big data
- rna seq
- solid phase extraction
- deep learning
- data analysis
- artificial intelligence
- social media
- case control
- healthcare
- health information
- patient reported outcomes
- water soluble
- real time pcr
- psychometric properties
- ms ms