AntDAS: Automatic Data Analysis Strategy for UPLC-QTOF-Based Nontargeted Metabolic Profiling Analysis.
Hai-Yan Fu, Xiao-Ming Guo, Yue-Ming Zhang, Jing-Jing Song, Qing-Xia Zheng, Ping-Ping Liu, Peng Lu, Qian-Si Chen, Yong-Jie Yu, Yuanbin She
Published in: Analytical Chemistry (2017)
High-quality data analysis methodology remains a bottleneck for metabolic profiling analysis based on ultraperformance liquid chromatography-quadrupole time-of-flight mass spectrometry (UPLC-QTOF). The present work aims to address this problem by proposing a novel data analysis strategy wherein (1) chromatographic peaks in the UPLC-QTOF data set are automatically extracted by using an advanced multiscale Gaussian smoothing-based peak extraction strategy; (2) a peak annotation stage is used to cluster fragment ions that belong to the same compound; (3) with the aid of the high-resolution mass spectrometer, a time-shift correction across the samples is efficiently performed by a new peak alignment method; (4) components are registered by using a newly developed adaptive network searching algorithm; (5) statistical methods, such as analysis of variance and hierarchical cluster analysis, are then used to identify the underlying marker compounds; and finally, (6) compound identification is performed by matching the extracted peak information, namely high-precision m/z and retention time, against our compound library containing more than 500 plant metabolites. A manually designed mixture of 18 compounds is used to evaluate the performance of the method, and all compounds are detected at various concentration levels. The developed method is also comprehensively evaluated on an extremely complex plant data set containing more than 2000 components. Results indicate that the performance of the developed method is comparable with that of XCMS. The MATLAB GUI code is available from http://software.tobaccodb.org/software/antdas .
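The library matching in step (6) can be illustrated with a minimal Python sketch. It is not the AntDAS implementation (which is distributed as MATLAB GUI code), and everything specific in it is an assumption made for illustration: the names `LibraryEntry`, `Peak`, `ppm_error`, and `match_peak`, the tolerances of 10 ppm and 0.2 min, and the placeholder library entries with made-up m/z and retention-time values. A peak is accepted as a match only when both its m/z error and its retention-time difference fall within tolerance, and the candidate with the smallest m/z error is returned.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LibraryEntry:
    """One reference compound (hypothetical structure, not the AntDAS library format)."""
    name: str
    mz: float       # reference m/z
    rt_min: float   # reference retention time, minutes


@dataclass(frozen=True)
class Peak:
    """One extracted chromatographic peak."""
    mz: float
    rt_min: float


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_peak(
    peak: Peak,
    library: List[LibraryEntry],
    ppm_tol: float = 10.0,     # assumed tolerance, not taken from the paper
    rt_tol_min: float = 0.2,   # assumed tolerance, not taken from the paper
) -> Optional[LibraryEntry]:
    """Return the library entry closest in m/z among those within both tolerances."""
    candidates = []
    for entry in library:
        ppm = abs(ppm_error(peak.mz, entry.mz))
        rt_diff = abs(peak.rt_min - entry.rt_min)
        if ppm <= ppm_tol and rt_diff <= rt_tol_min:
            candidates.append((ppm, rt_diff, entry))
    if not candidates:
        return None
    # Rank by mass error first, then by retention-time difference.
    return min(candidates, key=lambda c: (c[0], c[1]))[2]


# Placeholder library with made-up values, for illustration only.
LIBRARY = [
    LibraryEntry("compound_A", 355.1024, 5.20),
    LibraryEntry("compound_B", 355.1100, 5.25),
    LibraryEntry("compound_C", 611.1607, 7.80),
]

if __name__ == "__main__":
    for peak in (Peak(355.1030, 5.23), Peak(611.1600, 9.00)):
        hit = match_peak(peak, LIBRARY)
        print(peak, "->", hit.name if hit else "no match")
```

Requiring agreement in both dimensions is what lets retention time separate candidates that a mass tolerance alone cannot. In practice the tolerances would be set from the mass accuracy of the instrument and the residual time shift left after peak alignment.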