LiPydomics: A Python Package for Comprehensive Prediction of Lipid Collision Cross Sections and Retention Times and Analysis of Ion Mobility-Mass Spectrometry-Based Lipidomics Data.
Dylan H Ross, Jang Ho Cho, Rutan Zhang, Kelly M Hines, Libin Xu
Published in: Analytical Chemistry (2020)
Comprehensive profiling of lipid species in a biological sample, or lipidomics, is a valuable approach to elucidating disease pathogenesis and identifying biomarkers. Currently, a typical lipidomics experiment may track hundreds to thousands of individual lipid species. However, drawing biological conclusions requires multiple data-processing steps to enrich for significantly altered features, followed by confident identification of those features. Existing solutions for these data analysis challenges (i.e., multivariate statistics and lipid identification) involve performing various steps using different software applications, which imposes a practical limitation and may reduce reproducibility. Hydrophilic interaction liquid chromatography-ion mobility-mass spectrometry (HILIC-IM-MS) has shown advantages in separating lipids through orthogonal dimensions. However, there are still gaps in the coverage of lipid classes in the literature. To enable reproducible and efficient analysis of HILIC-IM-MS lipidomics data, we developed an open-source Python package, LiPydomics, which supports statistical and multivariate analyses ("stats" module), generation of informative plots ("plotting" module), identification of lipid species at different confidence levels ("identification" module), and access to all functions through a user-friendly text-based interface ("interactive" module). To support lipid identification, we assembled a comprehensive experimental database of m/z and collision cross section (CCS) values for 45 lipid classes, 23 of which also have HILIC retention times. Prediction models for CCS and HILIC retention time for 22 and 23 lipid classes, respectively, were trained using the large experimental data set, which enabled the generation of a large predicted lipid database with 145,388 entries. Finally, we demonstrated the utility of the Python package using Staphylococcus aureus strains that are resistant to various antimicrobials.
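The idea of identifying lipids "at different confidence levels" can be illustrated with a minimal, self-contained sketch. It is not the LiPydomics API: the `RefLipid` class, the `identify` function, the level names, the tolerance defaults, and the reference values are all hypothetical and chosen only for illustration. The sketch matches a measured feature against a small reference table and reports the most stringent combination of m/z, CCS, and retention time that still yields a match.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RefLipid:
    """One reference entry. All values used below are made up for illustration."""
    name: str
    mz: float
    rt: Optional[float]  # HILIC retention time in minutes, None if not available
    ccs: float           # collision cross section in square angstroms


def ppm_error(measured: float, reference: float) -> float:
    """Absolute m/z error in parts per million."""
    return abs(measured - reference) / reference * 1e6


def percent_error(measured: float, reference: float) -> float:
    """Absolute relative error in percent."""
    return abs(measured - reference) / reference * 100.0


def identify(
    mz: float,
    rt: float,
    ccs: float,
    database: List[RefLipid],
    mz_tol_ppm: float = 20.0,   # assumed tolerance, not taken from the paper
    rt_tol_min: float = 0.5,    # assumed tolerance, not taken from the paper
    ccs_tol_pct: float = 3.0,   # assumed tolerance, not taken from the paper
) -> Tuple[str, List[str]]:
    """Return (level, names) for the most stringent level with at least one match."""
    mz_hits = [r for r in database if ppm_error(mz, r.mz) <= mz_tol_ppm]
    mz_ccs_hits = [r for r in mz_hits if percent_error(ccs, r.ccs) <= ccs_tol_pct]
    full_hits = [
        r for r in mz_ccs_hits
        if r.rt is not None and abs(rt - r.rt) <= rt_tol_min
    ]
    # Try the strictest level first, then fall back to looser ones.
    for level, hits in (
        ("mz_rt_ccs", full_hits),
        ("mz_ccs", mz_ccs_hits),
        ("mz", mz_hits),
    ):
        if hits:
            return level, [r.name for r in hits]
    return "unidentified", []


# Hypothetical reference table; names and numbers are placeholders.
database = [
    RefLipid("lipid_A", 760.5851, 4.2, 285.0),
    RefLipid("lipid_B", 760.5900, None, 300.0),
    RefLipid("lipid_C", 500.0000, 1.0, 220.0),
]

print(identify(760.5855, 4.3, 286.0, database))
print(identify(760.5900, 9.0, 299.0, database))
print(identify(500.0010, 5.0, 260.0, database))
```

Falling back from the strictest level to looser ones means a feature is always reported with the strongest evidence available, and an entry without a retention time (such as the hypothetical `lipid_B`) can still be matched on m/z and CCS alone.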
Keyphrases
- mass spectrometry
- data analysis
- liquid chromatography
- fatty acid
- staphylococcus aureus
- big data
- escherichia coli
- high performance liquid chromatography
- gas chromatography
- simultaneous determination
- artificial intelligence
- cystic fibrosis
- methicillin resistant staphylococcus aureus