pseudoQC: A Regression-Based Simulation Software for Correction and Normalization of Complex Metabolomics and Proteomics Datasets.

Published in: Proteomics (2019)

Unwanted and uncontrollable signal variations in MS-based metabolomics and proteomics datasets severely degrade the accuracy of metabolite and protein profiling. Pooled quality control (QC) samples are therefore often used in quality management, which is indispensable to the success of metabolomics and proteomics experiments, especially in high-throughput and long-term projects. However, data consistency and QC sample stability remain difficult to guarantee because of the complexity of experimental operations and differences between experimenters. To make matters worse, many proteomics projects do not include QC samples in the initial experimental design. Herein, a powerful and interactive web-based software, named pseudoQC, is presented to simulate QC sample data for actual metabolomics and proteomics datasets using four different machine learning-based regression methods. The simulated data are used for correction and normalization of two published datasets, and the results suggest that nonlinear regression methods perform better than linear ones. Additionally, the software is available as a web-based graphical user interface and can be used by scientists without a bioinformatics background. pseudoQC is open-source software and freely available at https://www.omicsolution.org/wukong/pseudoQC/.
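To illustrate the general idea of regression-based QC correction, the following is a minimal sketch and not the pseudoQC implementation. The abstract does not name the four regression methods, so the sketch assumes an ordinary least-squares line as the linear model and a Gaussian kernel smoother as a stand-in nonlinear model. The data are synthetic, and all function names, the drift shape, and the bandwidth are hypothetical choices made for illustration only.

```python
import math
import statistics


def fit_linear(x, y):
    """Ordinary least-squares line; returns a function predicting y at t."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    intercept = my - slope * mx
    return lambda t: intercept + slope * t


def fit_kernel(x, y, bandwidth=3.0):
    """Gaussian kernel smoother (an assumed stand-in for a nonlinear regressor)."""
    def predict(t):
        weights = [math.exp(-0.5 * ((t - xi) / bandwidth) ** 2) for xi in x]
        return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)
    return predict


def correct(order, intensity, drift_model, reference):
    """Divide out the modelled drift and rescale to a reference level."""
    return [v * reference / drift_model(t) for t, v in zip(order, intensity)]


def cv(values):
    """Coefficient of variation (sample standard deviation / mean)."""
    return statistics.stdev(values) / statistics.fmean(values)


# Synthetic QC intensities for one feature, with a curved drift over
# injection order (hypothetical data, not taken from the paper).
qc_order = list(range(0, 40, 4))
qc_intensity = [
    1000.0 * (1.0 - 0.01 * t + 0.3 * math.sin(math.pi * t / 36.0))
    for t in qc_order
]
reference = statistics.median(qc_intensity)

linear_model = fit_linear(qc_order, qc_intensity)
kernel_model = fit_kernel(qc_order, qc_intensity)

cv_raw = cv(qc_intensity)
cv_linear = cv(correct(qc_order, qc_intensity, linear_model, reference))
cv_kernel = cv(correct(qc_order, qc_intensity, kernel_model, reference))

print(f"QC CV raw:    {cv_raw:.3f}")
print(f"QC CV linear: {cv_linear:.3f}")
print(f"QC CV kernel: {cv_kernel:.3f}")
```

On this toy curved drift, the nonlinear smoother leaves less residual variation in the QC intensities than the straight line does. That is only consistent in spirit with the abstract's conclusion and says nothing about the published datasets.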

Keyphrases