Compound Similarity Network as a Novel Data Mining Strategy for High-Throughput Investigation of Degradation Pathways of Organic Pollutants in Industrial Wastewater Treatment.

Lirong An, Bin Chen, Yuchen Zhang, Hailiang Li, Rongfu Huang, Feng Li, Yanan Tang
Published in: Analytical Chemistry (2024)
Identifying degradation products and pathways is crucial for investigating emerging pollutants and evaluating wastewater treatment methods. Nontargeted analysis is a powerful tool for comprehensively investigating the degradation pathways of organic pollutants in real-world wastewater samples, but it often generates large data sets in which the information of interest is difficult to locate. Herein, to efficiently establish linkages among compounds in the same degradation pathway, we introduce a compound similarity network (CSN) as a novel data mining strategy for LC-MS-based nontargeted analysis of complex wastewater samples. Unlike molecular networks, which cluster compounds based on MS/MS spectral similarity, our CSN strategy uses molecular fingerprints to establish linkages among compounds and is thus spectra-independent. The effectiveness of CSN was demonstrated by nontargeted identification of degradation pathways and products of organic pollutants in leather industrial wastewater that underwent laboratory-scale activated carbon adsorption (ACD) and ozonation treatments. Using CSN to interpret the nontargeted data, we tentatively annotated 4324 compounds in the untreated leather industrial wastewater, 3246 after ACD, and 3777 after ACD/ozonation. We located 145 potential degradation pathways of organic pollutants in the ACD/ozonation process using CSN and validated 7 pathways with 15 chemical standards. CSN also revealed 5 clusters of emerging pollutants, from which 3 compounds were selected for an in vitro cytotoxicity study to evaluate their potential biohazards as new pollutants. Because CSN offers an efficient way to connect large numbers of compounds and to find multiple degradation pathways in a high-throughput manner, we anticipate that it will find wide application in nontargeted analysis of diverse environmental samples.
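
The idea of linking compounds by fingerprint similarity rather than by MS/MS spectra can be illustrated with a minimal sketch. The abstract does not state which fingerprint type, similarity metric, or cutoff the authors used, so everything below is an assumption: fingerprints are represented as sets of "on" bit indices, similarity is the Tanimoto coefficient, the threshold of 0.6 is arbitrary, and the compound names and bit sets are invented toy data, not results from the paper.

```python
from itertools import combinations


def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_network(fingerprints, threshold=0.6):
    """Return {(u, v): similarity} for every pair at or above the threshold.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    edges = {}
    for u, v in combinations(sorted(fingerprints), 2):
        score = tanimoto(fingerprints[u], fingerprints[v])
        if score >= threshold:
            edges[(u, v)] = score
    return edges


def clusters(nodes, edges):
    """Group nodes into connected components of the similarity network."""
    adjacency = {n: set() for n in nodes}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, result = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        result.append(sorted(component))
    return result


# Toy fingerprints (invented): a parent compound, two structurally
# related products, and one unrelated compound.
toy_fingerprints = {
    "parent": {1, 2, 3, 4, 5, 6},
    "product_A": {1, 2, 3, 4, 5, 7},
    "product_B": {1, 2, 3, 4, 7, 8},
    "unrelated": {20, 21, 22},
}

toy_edges = build_network(toy_fingerprints, threshold=0.6)
toy_clusters = clusters(toy_fingerprints, toy_edges)
```

In this toy case the parent and product_B fall below the cutoff when compared directly, yet they land in the same cluster because both are similar to product_A. That chaining is one plausible way a similarity network could connect successive members of a degradation pathway; in practice the fingerprints would come from a cheminformatics toolkit applied to the annotated structures.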