Protected Geographical Indication Discrimination of Zhejiang and Non-Zhejiang Ophiopogonis japonicus by Near-Infrared (NIR) Spectroscopy Combined with Chemometrics: The Influence of Different Stoichiometric and Spectrogram Pretreatment Methods.
Qingge Ji, Chaofeng Li, Xianshu Fu, Jinyan Liao, Xuezhen Hong, Xiaoping Yu, Zihong Ye, Mingzhou Zhang, Yulou Qiu. Published in: Molecules (Basel, Switzerland) (2023)
This paper presents a method for the protected geographical indication discrimination of Ophiopogon japonicus from Zhejiang and from other regions using near-infrared (NIR) spectroscopy combined with chemometrics. A total of 3657 Ophiopogon japonicus samples from five major production areas in China were analyzed by NIR spectroscopy: 2127 from Zhejiang and 1530 from other areas ('non-Zhejiang'). Principal component analysis (PCA) was used to screen for and remove outliers. Monte Carlo cross-validation (MCCV) was used to divide the samples into a training set and a test set at a ratio of 3:7. The raw spectra were preprocessed by nine methods, applied singly or in partial combination, such as standard normal variate (SNV) and derivative transformations, and then modeled by partial least squares regression (PLSR), a support vector machine (SVM), and soft independent modeling of class analogy (SIMCA). The effects of the different pretreatment and chemometric methods on the models are discussed. The results showed that all three pattern recognition methods were effective for geographical origin tracing, and that selecting an appropriate preprocessing method could improve traceability accuracy. PLSR performed better after SNV pretreatment, with R² reaching 0.9979, while the second derivative gave the lowest R², 0.9656. SVM reached its highest accuracy after SNV pretreatment: 99.73% on the training set and 98.40% on the test set. SIMCA achieved the highest origin traceability accuracy for Ophiopogon japonicus when pretreated with SNV or multiplicative scatter correction (MSC), reaching 100%. The distance between the two class models in SIMCA-SNV and SIMCA-MSC was greater than 3, indicating that the SIMCA model performs well.
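As an illustration of two steps named in the abstract, the following is a minimal sketch of SNV pretreatment and a Monte Carlo style random split. It is not the authors' code: the function names `snv` and `mccv_splits` are hypothetical, the spectra are synthetic random numbers, the number of repeats is arbitrary, and the training fraction of 0.3 assumes the stated 3:7 ratio is read as training:test.

```python
import numpy as np


def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std


def mccv_splits(n_samples, train_fraction, n_repeats, seed=0):
    """Yield (train_idx, test_idx) pairs from repeated random partitions."""
    rng = np.random.default_rng(seed)
    n_train = int(round(train_fraction * n_samples))
    for _ in range(n_repeats):
        perm = rng.permutation(n_samples)
        yield perm[:n_train], perm[n_train:]


# Synthetic stand-in for NIR spectra: 20 samples x 50 wavelength points.
rng = np.random.default_rng(42)
X = rng.normal(loc=1.0, scale=0.2, size=(20, 50))

X_snv = snv(X)

# Assumption: 3:7 is taken as training:test, so 30% of samples go to training.
splits = list(mccv_splits(len(X), train_fraction=0.3, n_repeats=5))
```

After SNV, every row of `X_snv` has zero mean and unit standard deviation, which removes per-sample offset and scaling differences before modeling. Each element of `splits` is one random partition; a model would be fitted on the training indices and scored on the test indices, with results averaged over the repeats.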