
Chinese Journal of Applied Ecology ›› 2020, Vol. 31 ›› Issue (10): 3509-3517. doi: 10.13287/j.1001-9332.202010.018

• Original Articles •

Assessing soil pH in Anhui Province based on different features mining methods combined with generalized boosted regression models

WANG Shi-hang1,2*, LU Hong-liang1, ZHAO Ming-song1,2, ZHOU Ling-mei1   

  1. School of Geomatics, Anhui University of Science and Technology, Huainan 232001, Anhui, China;
  2. State Key Laboratory of Soil and Sustainable Agriculture, Institute of Soil Science, Chinese Academy of Sciences, Nanjing 210008, China
  • Received: 2020-05-06 Accepted: 2020-08-11 Online: 2020-10-15 Published: 2021-04-15
  • Contact: * E-mail: wangshihang122@163.com
  • Supported by:
    National Natural Science Foundation of China (31700369, 41501226).

Abstract: We explored the application of different feature mining methods combined with generalized boosted regression models in digital soil mapping. Environmental covariates were selected with two feature selection methods: recursive feature elimination and selection by filtering. Using either the original environmental covariates or the selected optimal variable combination as independent variables, we built soil pH prediction models for Anhui Province with the generalized boosted regression model and the random forest model, and mapped the predictions. Both feature mining methods effectively improved the accuracy of soil pH prediction by the generalized boosted regression and random forest models while reducing dimensionality. On the validation set, the prediction accuracy of the generalized boosted regression model was slightly lower than that of the random forest model. On the training set, the accuracy of the generalized boosted regression model was much higher than that of the random forest model, with greater explanatory power and a better overall effect. The main parameters of the random forest model, ntree and mtry, had limited effect on model performance. In contrast, different parameters and their combinations affected the prediction accuracy of the generalized boosted regression model, so they should be tuned before modeling. Spatial mapping showed that soil pH in Anhui Province followed a pattern of "acidic in the south, alkaline in the north".
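As a rough illustration of the idea behind selection by filtering, the sketch below scores each covariate on its own against the target and keeps those that pass a threshold. It is not the authors' workflow: the absolute Pearson correlation score, the 0.3 threshold, the covariate names, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def filter_select(covariates, target, threshold=0.3):
    """Score each covariate independently and keep those with |r| >= threshold.

    The scoring rule and threshold are illustrative assumptions,
    not values taken from the study.
    """
    scores = {name: abs(pearson(values, target)) for name, values in covariates.items()}
    kept = [name for name, score in scores.items() if score >= threshold]
    return kept, scores


# Synthetic data: hypothetical covariate names, standard-normal values.
rng = random.Random(0)
n = 500
covariates = {
    "precipitation": [rng.gauss(0, 1) for _ in range(n)],
    "elevation": [rng.gauss(0, 1) for _ in range(n)],
    "noise_a": [rng.gauss(0, 1) for _ in range(n)],
    "noise_b": [rng.gauss(0, 1) for _ in range(n)],
}

# Synthetic "soil pH" that depends only on the first two covariates.
ph = [
    6.5 - 0.8 * p - 0.6 * e + rng.gauss(0, 0.3)
    for p, e in zip(covariates["precipitation"], covariates["elevation"])
]

kept, scores = filter_select(covariates, ph)
for name, score in scores.items():
    print(f"{name}: |r| = {score:.2f}")
print("kept:", kept)
```

With data generated this way, the two informative covariates should be retained and the two noise covariates dropped. A filter of this kind scores covariates without fitting the prediction model, whereas recursive feature elimination repeatedly refits the model and discards the least important covariates.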

Key words: soil pH, feature mining, generalized boosted regression models, random forest, machine learning, Anhui Province