Decision tree for estimating groundwater contaminant through proxies considering seasonality and soil saturation.
Saha Dauji, Tirumalesh Keesari. Published in: Environmental Monitoring and Assessment (2021)
Chloride ion is an important indicator of water quality. Field measurement of chloride is difficult, whereas laboratory measurement is both time-consuming and chemical-intensive. The conservative nature of chloride and its good correlation with electrical conductivity (EC) justify the use of EC as a proxy for chloride estimation. Comparing the best regression models (RMs) with a data-driven decision tree (DT) model shows the relative merits of the two approaches for this purpose. The quantitative improvements over the models from the literature are an increase in correlation (RM: 0.70 to 0.77; DT: 0.70 to 0.78) and a decrease in relative errors (RM: MARE 0.88 to 0.65 and RMSRE 1.91 to 0.92; DT: MARE 0.88 to 0.40 and RMSRE 1.91 to 0.54); DT thereby emerged as the better modeling approach for this case. Considering the influence of seasonality (pre- or post-monsoon) and the degree of soil saturation (waterlogged or water-depleted) narrowed the correlation range of the basic variables (0.24-0.87) to a smaller range (0.44-0.89) for the estimates of Cl-, with relative errors ranging from 0.35 to 0.57; the improvement was more pronounced where the correlation between the variables was lower. The overall comparison on the evaluation datasets between the RM from the literature and the RM and DT models from this study showed that, for the study area, the case-specific models developed with the data-driven DT tool gave the most accurate estimates of chloride in groundwater from the chosen proxy, EC.
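The abstract reports MARE and RMSRE without defining them. A minimal sketch of how such relative-error metrics are commonly computed follows, assuming MARE stands for mean absolute relative error and RMSRE for root mean square relative error, both taken relative to the observed value. These expansions, the function names, and the sample values are assumptions for illustration and are not taken from the paper.

```python
import math


def mare(observed, estimated):
    """Mean absolute relative error (assumed definition): mean(|est - obs| / |obs|)."""
    errors = [abs(e - o) / abs(o) for o, e in zip(observed, estimated)]
    return sum(errors) / len(errors)


def rmsre(observed, estimated):
    """Root mean square relative error (assumed definition): sqrt(mean(((est - obs) / obs)^2))."""
    squared = [((e - o) / o) ** 2 for o, e in zip(observed, estimated)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical chloride concentrations (mg/L), not data from the study.
# Observed values must be non-zero, since each error is divided by the observation.
observed_cl = [100.0, 200.0, 50.0, 400.0]
estimated_cl = [110.0, 180.0, 60.0, 400.0]

print(f"MARE:  {mare(observed_cl, estimated_cl):.3f}")
print(f"RMSRE: {rmsre(observed_cl, estimated_cl):.3f}")
```

Because RMSRE squares each relative error before averaging, it penalizes a few large misses more heavily than MARE does, which is consistent with the RMSRE values in the abstract being larger than the corresponding MARE values.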