Ensemble learning-based applied research on heavy metals prediction in a soil-rice system.

The Science of the Total Environment (2023)

Abstract
Accurately predicting heavy metal accumulation in soil ecosystems is crucial for maintaining healthy soil environments and ensuring high-quality agricultural products, and it remains a challenging scientific task. In this study, we constructed a dataset containing 490 sets of multidimensional environmental covariate data and proposed prediction models for heavy metal concentrations (HMC) in a soil-rice system, collectively termed EL-HMC (comprising RF-HMC and GBM-HMC), based on the Random Forest (RF) and Gradient Boosting Machine (GBM) ensemble learning (EL) techniques. To evaluate the effectiveness of each model, multiple linear regression and Bayesian regression were selected as benchmark models (BMs), and mean absolute error (MAE), root mean square error (RMSE), and the coefficient of determination (R²) were selected as evaluation indicators. In addition, sensitivity and spatial autocorrelation (SAC) analyses were used to examine the robustness of the models. The results showed that the R² values of RF-HMC and GBM-HMC for modeling available cadmium (Cd) concentrations in soil were 0.654 and 0.690, respectively, an average increase of 48.0 % compared with the BMs. The R² values of RF-HMC and GBM-HMC for predicting Cd, lead (Pb), chromium (Cr), and mercury (Hg) concentrations in rice ranged from 0.618 to 0.824 and from 0.645 to 0.850, respectively, an average increase of 58.2 % compared with the BMs. The corresponding MAEs and RMSEs of RF-HMC and GBM-HMC were low. Sensitivity analysis of the input features and the SAC of the prediction bias showed that the EL-HMC models have excellent robustness. Therefore, the EL-based prediction models for HMCs proposed herein are practical and feasible, demonstrating better accuracy and stability than the traditional benchmark models.
This study verifies the application potential of EL technology in pollution ecology and provides a new perspective and solution for sustainable management and precise prevention of heavy metal pollution in farmland soil at the regional scale.
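The three evaluation indicators named in the abstract (MAE, RMSE, and the coefficient of determination) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names and the toy observed/predicted values are hypothetical and chosen only to show how each indicator is computed from a pair of series.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalizes large errors more than MAE does."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy concentrations (not data from the study).
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.37]

print(f"MAE  = {mae(observed, predicted):.4f}")
print(f"RMSE = {rmse(observed, predicted):.4f}")
print(f"R2   = {r2(observed, predicted):.4f}")
```

Lower MAE and RMSE indicate smaller prediction errors, while a coefficient of determination closer to 1 indicates that the model explains more of the variance in the observed concentrations.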
Keywords
Environmental factor, Random forest, Gradient boosting machine, Sensitivity, Spatial pattern