Decoding river pollution trends and their landscape determinants in an ecologically fragile karst basin using a machine learning model



Xu G, Fan H, Oliver DM, Dai Y, Li H, Shi Y, Long H, Xiong K & Zhao Z (2022) Decoding river pollution trends and their landscape determinants in an ecologically fragile karst basin using a machine learning model. Environmental Research, 214 (Part 4), Art. No.: 113843.

Karst watersheds have high landscape complexity and are influenced by both human and natural activity, which affects runoff formation and processes, sediment connectivity and contaminant transport, and alters natural hydrological and nutrient cycling. However, physical monitoring stations are costly and labor-intensive, which has limited the spatial extent over which water quality impairments can be assessed. The geographical characteristics of catchments are potential influences on water quality, but they have often been overlooked in previous studies of highly heterogeneous karst landscapes. To address this, we developed a machine learning method that applies Extreme Gradient Boosting (XGBoost) to predict the spatial distribution of water quality in the world's most ecologically fragile karst watershed, and we used SHapley Additive exPlanations (SHAP) to explain the potential determinants. Before this process, we first used the water quality damage index (WQI-DET) to evaluate the water quality impairment status and determined that CODMn, TN and TP were causing river water quality impairments in the WRB. Second, we selected 46 watershed features based on the three key processes (sources-mobilization-transport) that affect the temporal and spatial variation of river pollutants, in order to predict water quality in unmonitored reaches and decipher the potential determinants of river impairments. Predicted CODMn concentrations ranged from 1.39 to 17.40 mg/L. Predicted TP and TN concentrations ranged from 0.02 to 1.31 mg/L and from 0.25 to 5.72 mg/L, respectively. In general, the XGBoost model performed well in predicting pollutant concentrations in the WRB. SHAP indicated that pollutant levels may be driven by three factors: anthropogenic sources (agricultural pollution inputs), fragile soils (low organic carbon content and high soil permeability to water flow), and pollutant transport mechanisms (topographic wetness index (TWI), carbonate rocks).
Our study provides key data to support decision-making for water quality restoration projects in the WRB and information to help bridge the science:policy gap.
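
As an illustration of the attribution idea behind SHAP, the sketch below computes exact Shapley values for a single prediction. It is not the authors' implementation: the model is a hand-written toy function rather than a fitted XGBoost regressor, and the feature names (`cropland`, `soil_carbon`, `twi`), coefficients, and sample values are hypothetical and chosen only for demonstration. Exact enumeration scales exponentially with the number of features, so it would not be practical for the 46 watershed features described above.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample x, relative to a baseline sample."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                # Coalition features take their real value; the rest stay at baseline
                without_i = [x[j] if j in subset else baseline[j] for j in range(n)]
                with_i = list(without_i)
                with_i[i] = x[i]
                phi[i] += weight * (predict(with_i) - predict(without_i))
    return phi


def toy_model(features):
    """Hypothetical stand-in for a fitted regressor (assumption, not from the paper)."""
    cropland, soil_carbon, twi = features
    return 2.0 + 3.0 * cropland - 1.5 * soil_carbon + 0.5 * cropland * twi


# Hypothetical values: a reference catchment and one catchment to explain
baseline = [0.2, 1.0, 5.0]
sample = [0.6, 0.4, 8.0]

phi = shapley_values(toy_model, sample, baseline)
for name, value in zip(["cropland", "soil_carbon", "twi"], phi):
    print(f"{name}: {value:+.3f}")
```

The attributions sum to the difference between the prediction for the sample and the prediction for the baseline, which is the property that lets each feature's contribution be read as a share of the predicted concentration.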

Ecologically fragile karst basin; Water quality assessment; XGBoost regression; Shapley additive explanations; Determinant analysis;

Environmental Research: Volume 214, Issue Part 4

Funders: National Natural Science Foundation of China
Publication date: 30/11/2022
Publication date online: 02/08/2022
Date accepted by journal: 04/07/2022
Publisher: Elsevier BV

People (1)


Dr David Oliver

Associate Professor, Biological and Environmental Sciences