
Norm ISWSVR: A Data Integration and Normalization Approach for Large-Scale Metabolomics.

Xian Ding, Fen Yang, Yanhua Chen, Jing Xu, Jiuming He, Ruiping Zhang, Zeper Abliz
Published in: Analytical Chemistry (2022)
Large-scale, long-term metabolomics studies are particularly susceptible to various sources of systematic error, which lead to nonreproducibility and poor data quality. A reliable and robust batch correction method removes unwanted systematic variation and improves the statistical power of metabolomics data, making it an important part of quality control in metabolomics. This study proposes a novel data normalization and integration method, Norm ISWSVR. It is a two-step approach that combines best-performance internal standard correction with support vector regression normalization to comprehensively remove systematic errors, random errors, and matrix effects. The method was investigated in three untargeted lipidomics or metabolomics datasets, and its performance was evaluated systematically against 11 other normalization methods. Norm ISWSVR decreased the median cross-validated relative standard deviation (cvRSD) of the data, increased the correlation between quality control samples (QCs), improved the classification accuracy of biomarkers, and was compatible with quantitative data. More importantly, Norm ISWSVR tolerates a low frequency of QCs, which could significantly reduce the burden of a large-scale experiment. Overall, Norm ISWSVR improves the data quality of large-scale metabolomics data.
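The first of the two steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "best-performance" means choosing, per feature, the internal standard whose ratio gives the lowest RSD across QC injections, and it uses a plain RSD rather than the cross-validated cvRSD reported in the abstract. The function names, the intensities, and the internal standard labels `IS_A` and `IS_B` are hypothetical. The second step (support vector regression on the QC signal) is not shown.

```python
from statistics import mean, stdev


def rsd(values):
    """Relative standard deviation in percent: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


def best_is_correction(feature, internal_standards, qc_idx):
    """Divide a feature by each candidate internal standard and keep the
    candidate with the lowest RSD over the QC injections.

    Assumption: lowest QC RSD is the selection criterion. If no candidate
    beats the uncorrected signal, the feature is returned unchanged.
    """
    best_name = None
    best_corrected = list(feature)
    best_rsd = rsd([feature[i] for i in qc_idx])
    for name, is_signal in internal_standards.items():
        corrected = [f / s for f, s in zip(feature, is_signal)]
        score = rsd([corrected[i] for i in qc_idx])
        if score < best_rsd:
            best_name, best_corrected, best_rsd = name, corrected, score
    return best_name, best_corrected, best_rsd


# Hypothetical run of six injections; positions 0, 2 and 4 are QCs.
# The feature loses signal over the run (instrument drift).
feature = [1000.0, 1350.0, 800.0, 560.0, 600.0, 750.0]
internal_standards = {
    "IS_B": [200.0, 210.0, 190.0, 205.0, 195.0, 200.0],  # does not track the drift
    "IS_A": [500.0, 450.0, 400.0, 350.0, 300.0, 250.0],  # tracks the drift
}
qc_idx = [0, 2, 4]

raw_qc_rsd = rsd([feature[i] for i in qc_idx])
name, corrected, score = best_is_correction(feature, internal_standards, qc_idx)
print(f"raw QC RSD: {raw_qc_rsd:.1f}%")
print(f"selected {name}, corrected QC RSD: {score:.1f}%")
```

In this toy data the QC signal drifts from 1000 to 600, and dividing by the internal standard that follows the same drift flattens the QC ratios. Real data would leave residual variation for the regression step to model.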
Keyphrases
  • electronic health record
  • mass spectrometry
  • big data
  • machine learning
  • quality control
  • data analysis
  • risk factors
  • liquid chromatography
  • rna seq