
Chinese Journal of Applied Ecology (应用生态学报) ›› 2025, Vol. 36 ›› Issue (6): 1889-1897. doi: 10.13287/j.1001-9332.202506.018

• Research Article •

Predicting heavy metal concentration in crop grain using automated machine learning models

ZHANG Yexiang1, CHEN Fengxian2, ZHANG Yuhong1, CHEN Xijuan2*

  1. School of Environmental and Safety Engineering, Shenyang University of Chemical Technology, Shenyang 110142, China;
  2. Institute of Applied Ecology, Chinese Academy of Sciences, Shenyang 110016, China
  • Received: 2024-10-24 Accepted: 2025-04-30 Online: 2025-06-18 Published: 2025-12-18
  • Corresponding author: *E-mail: chenxj@iae.ac.cn
  • About the author: ZHANG Yexiang, male, born in 2000, master's student. His research focuses on risk assessment of the migration and accumulation of soil pollutants. E-mail: zyx83621115@163.com
  • Funding:
    Special Project for Science and Technology Innovation in Black Soil Conservation and Utilization (XDA28090100)

Abstract: With accelerating industrialization and intensifying agricultural activity, heavy metal (HM) pollution in crops has become an issue that cannot be ignored in agricultural production. Based on 791 data sets from 54 publications, we predicted HM concentrations in crop grains using automated machine learning (AutoML) models. Ten factors were used as input variables: organic fertilizer application rate, HM concentration in organic fertilizer, soil HM concentration, soil organic matter, pH, cation exchange capacity, clay content, silt content, sand content, and crop type. The concentrations of chromium (Cr), cadmium (Cd), lead (Pb), arsenic (As), and mercury (Hg) in crop grains were set as output variables. We evaluated the simulation and prediction performance of six models: deep learning (DL), distributed random forest (DRF), extremely randomized trees (XRT), stacked ensemble (SE), gradient boosting machine (GBM), and generalized linear model (GLM), and analyzed the key factors driving HM accumulation in crop grains. The results showed that the best-performing prediction model differed among HMs. The DL model gave the best predictions for Cr, Pb, As, and Hg, while the GBM model achieved the highest prediction accuracy for Cd. Feature importance and SHAP analysis indicated that organic fertilizer application rate and crop type were the key factors influencing HM accumulation in crop grains. Organic fertilizer application rate, soil HM concentration, organic fertilizer HM concentration, and sand content were positively correlated with HM concentration in crop grains, whereas cation exchange capacity, pH, organic matter, and clay content were negatively correlated with it. In summary, the DL and GBM models performed better than the other models in predicting HM concentrations in crop grains, and in agricultural production the risk of HM input from organic fertilizer application must be strictly controlled.

Key words: machine learning, heavy metal, grain, organic fertilizer, prediction
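
The abstract compares six models for each of five metals and reports that the best model differs by metal. A minimal sketch of that kind of per-target model selection follows, assuming the lowest RMSE on held-out data is the selection criterion (the abstract does not state which metric was used). The function names, the placeholder labels `model_a`, `model_b` and `metal_x`, and the toy numbers are all hypothetical; they are not data or results from the study.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when observed values are constant")
    return 1.0 - ss_res / ss_tot


def best_model_per_target(observed, predictions):
    """Pick, for each target, the model with the lowest RMSE.

    observed:    {target: [measured values]}
    predictions: {target: {model name: [predicted values]}}
    Returns      {target: (model name, rmse, r2)}
    """
    best = {}
    for target, obs in observed.items():
        scored = [
            (rmse(obs, pred), name, r_squared(obs, pred))
            for name, pred in predictions[target].items()
        ]
        err, name, r2 = min(scored)  # smallest RMSE wins; ties fall back to name order
        best[target] = (name, err, r2)
    return best


# Hypothetical toy values for illustration only (assumption, not study data).
toy_observed = {"metal_x": [0.10, 0.20, 0.30, 0.40]}
toy_predictions = {
    "metal_x": {
        "model_a": [0.12, 0.18, 0.31, 0.39],
        "model_b": [0.25, 0.25, 0.25, 0.25],
    }
}
selection = best_model_per_target(toy_observed, toy_predictions)
print(selection)
```

Selecting per target rather than once overall mirrors the abstract's finding that no single model was best for every metal. Any other error metric could be substituted for RMSE in the same structure.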