Chinese Journal of Applied Ecology ›› 2024, Vol. 35 ›› Issue (3): 789-796. doi: 10.13287/j.1001-9332.202403.016

Prediction of atrazine degradation in soil based on XGBoost model

LI Xiangling1, CHEN Fengxian2, CHEN Xijuan2*

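  1. School of Environmental and Safety Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China;
  2. Institute of Applied Ecology, Chinese Academy of Sciences, Shenyang 110016, China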
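  • Received: 2023-08-31  Revised: 2024-01-29  Online: 2024-03-18  Published: 2024-06-18
  • Corresponding author: *E-mail: chenxj@iae.ac.cn
  • About the author: LI Xiangling, female, born in 1997, master's student. Her research focuses on the environmental fate of pollutants in soil. E-mail: a17614221249@163.com
  • Funding:
    Special Project of the Science and Technology Innovation Program for Black Soil Protection and Utilization (XDA28090100)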

Abstract: Establishing an optimal model with automated machine learning to predict the degradation efficiency of the herbicide atrazine in soil makes it possible to assess the residual risk of atrazine in soil. We collected 494 data pairs from 49 published articles and selected seven factors as input features: soil pH, organic matter content, saturated hydraulic conductivity, soil moisture, initial atrazine concentration, incubation time, and inoculation dose. With the first-order reaction rate constant (k) of atrazine in soil as the output feature, we built six models to predict the degradation efficiency of atrazine in soil and evaluated their performance comprehensively through linear regression and related evaluation metrics. The XGBoost model performed best in predicting k. Feature importance derived from the prediction model ranked the factors as soil moisture > incubation time > pH > organic matter > initial atrazine concentration > saturated hydraulic conductivity > inoculation dose. We used SHAP to explain the potential relationship between each feature and atrazine degradation in soil, as well as each feature's contribution. SHAP showed that incubation time contributed negatively to k, whereas saturated hydraulic conductivity contributed positively. High values of soil moisture, initial atrazine concentration, pH, inoculation dose, and organic matter content were generally distributed on both sides of SHAP = 0, indicating that their contributions to atrazine degradation in soil are complex. Combined with SHAP, the XGBoost model achieved high accuracy in predicting k while remaining interpretable. Using machine learning to fully exploit historical experimental data and predict atrazine degradation efficiency from environmental parameters is of great significance for setting atrazine application thresholds, reducing the residual and diffusion risks of atrazine in soil, and ensuring the safety of the soil environment.
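
For readers unfamiliar with the output feature, the following is a minimal sketch of how a first-order rate constant k can be estimated from a single incubation series. It assumes standard first-order kinetics, C(t) = C0·exp(−k·t), so that ln C falls linearly with time with slope −k and the half-life is ln 2 / k. The abstract does not say how k was derived in the source studies; the function names, the per-day unit, and the example data below are hypothetical and serve only as illustration.

```python
import math


def first_order_rate_constant(times, concentrations):
    """Estimate k from the least-squares slope of ln(C) versus t (k = -slope)."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two (time, concentration) pairs")
    if any(c <= 0 for c in concentrations):
        raise ValueError("concentrations must be positive to take logarithms")

    log_conc = [math.log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(log_conc) / n

    # Ordinary least squares: slope = Sxy / Sxx
    sxx = sum((t - t_mean) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, log_conc))
    return -sxy / sxx


def half_life(k):
    """Half-life implied by a first-order rate constant k."""
    return math.log(2) / k


# Hypothetical, noise-free series generated with k = 0.05 per day (illustration only)
days = [0, 7, 14, 28, 56]
conc = [10.0 * math.exp(-0.05 * d) for d in days]

k = first_order_rate_constant(days, conc)
print(f"k = {k:.4f} per day, half-life = {half_life(k):.1f} days")
```

A k value obtained this way for each experiment would play the role of the regression target, with the seven soil and incubation factors as inputs.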

Key words: XGBoost model, prediction, atrazine, soil, degradation