
Development of Agricultural Water-Consumption Prediction Model Based on Machine Learning and SHAP
昝子懿, 岳卫峰, 赵航正, 曹倡铭, 胡竞丹, 胡立堂, 徐洋, 陈爱萍
Predicting agricultural water usage is a key step in regional water resource planning and provides important guidance for the rational development of water resources and for food security. However, existing agricultural water usage prediction models commonly suffer from redundant input parameters and insufficient accuracy, which hinders effective water resource management and decision optimization. This study therefore takes the Hetao irrigation district in Inner Mongolia as the research area. First, principal component analysis (PCA) is applied to the driving factors of agricultural water usage to screen out the key factors influencing water usage in the district. Second, several machine learning-based models are constructed to predict agricultural water usage. Finally, the SHapley Additive exPlanations (SHAP) method is used to verify that applying the optimal model is reasonable and to examine in depth how each feature contributes to agricultural water usage. The results show that the multilayer perceptron (MLP) neural network predicts agricultural water usage effectively, with an R² of 0.84, and performs better than the five other machine learning models tested: least absolute shrinkage and selection operator regression (Lasso), ridge regression (Ridge), decision tree (DT), random forest (RF), and extreme gradient boosting (XGBoost). A quantitative SHAP analysis of the MLP model's input parameters shows that the total output value of the primary industry and grain yield have relatively high mean absolute SHAP values, while the size of the SHAP contributions differs slightly among irrigation sub-districts. Building such a prediction and screening model makes it possible to predict agricultural water usage accurately, which supports precision irrigation in the district and improves water-use efficiency. This is of practical significance for easing future conflicts between water supply and demand in the Hetao irrigation district.
agricultural water usage / machine learning / water usage prediction / SHAP / Hetao irrigation district
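The abstract evaluates models by R² and ranks input features by their mean absolute SHAP value. The sketch below illustrates both quantities using only the Python standard library. It is not the authors' implementation: the stand-in model `toy_model`, the feature names, and the four data rows are all hypothetical and synthetic, and exact Shapley values against a single mean baseline are used here in place of the `shap` package, which approximates the same attribution for larger models.

```python
from itertools import combinations
from math import factorial

# Hypothetical feature names and synthetic, unitless values (not data from the study).
FEATURES = ["primary_industry_output", "grain_yield", "irrigated_area"]
samples = [[1.0, 0.5, 0.2], [0.3, 0.9, 0.7], [0.8, 0.1, 0.4], [0.6, 0.6, 0.9]]


def toy_model(point):
    """Stand-in predictor with one interaction term (assumption; the paper uses an MLP)."""
    return 2.0 * point[0] + 1.0 * point[1] + 0.5 * point[0] * point[2]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict(x) relative to predict(baseline)."""
    m = len(x)

    def value(subset):
        # Features in `subset` keep their real value; the others use the baseline.
        return predict([x[j] if j in subset else baseline[j] for j in range(m)])

    phi = []
    for i in range(m):
        others = [j for j in range(m) if j != i]
        total = 0.0
        for size in range(m):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(m - size - 1) / factorial(m)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


# Baseline point: the column-wise mean of the synthetic samples.
baseline = [sum(col) / len(col) for col in zip(*samples)]

# Per-sample attributions, then the mean absolute value per feature.
all_phi = [shapley_values(toy_model, x, baseline) for x in samples]
mean_abs = [sum(abs(phi[i]) for phi in all_phi) / len(all_phi)
            for i in range(len(FEATURES))]
ranking = sorted(zip(FEATURES, mean_abs), key=lambda kv: kv[1], reverse=True)

for name, score in ranking:
    print(f"{name}: mean |SHAP| = {score:.4f}")
```

Exact enumeration visits every feature subset, so it is only practical for a handful of inputs. It is used here because it makes the additive property easy to check: the attributions for a sample sum to the difference between the model's prediction for that sample and its prediction at the baseline.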