Thesis information

Chinese title:

 基于机器学习的地质灾害易发单元识别研究——以青海黄湟谷地为例

Name:

 Li Xingyu (李兴宇)

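Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 0705Z3

Discipline:

 Natural Disaster Science

Student type:

 Master's student

Degree:

 Master of Science

Degree type:

 Academic degree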

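Degree year:

 2023

Campus:

 Beijing campus

School:

 Faculty of Geographical Science

Research direction:

 Disaster assessment

First supervisor:

 Wang Ying (王瑛)

First supervisor's affiliation:

 Faculty of Geographical Science

Submission date:

 2023-06-12

Defense date:

 2023-06-02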

English title:

 The study on susceptible units identification of geological disaster based on machine learning ——Take Huanghuang Valley in Qinghai as an Example    

Chinese keywords:

 地质灾害 ; 易发单元识别 ; 机器学习 ; BP神经网络 ; Bagging    

English keywords:

 geological disaster ; susceptible units identification ; machine learning ; BP neural network ; Bagging    

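Abstract (translated from the Chinese original):

Against the background of climate change, geological disasters show a clear increasing trend in northwest China, including Qinghai and Gansu. The Huanghuang Valley in Qinghai Province is an area of frequent geological disasters and also the main population center of Qinghai. High-precision identification of geological disaster susceptible units in this area is the foundation of, and the key to, efficient geological disaster prevention and control.

Using data on the factors that influence geological disasters and data on potential geological hazard sites, this thesis builds a dataset of geological disaster susceptible units with the slope unit as the basic evaluation unit, and applies the BP neural network model and the Bagging model to identify the geological disaster susceptible units of the Huanghuang Valley in Qinghai. The main conclusions are as follows: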

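(1) A regional dataset of geological disaster susceptible units was established for the Huanghuang Valley, with the slope unit as the basic evaluation unit. Based on the requirements of susceptible unit identification and on how different factors affect geological disaster susceptibility, 13 influencing factors were selected as parameters, including precipitation, slope, terrain relief, distance to fault zones, and lithology. GIS hydrological analysis tools were used to delineate slope units in the Huanghuang Valley, and the slope unit was adopted as the basic unit for susceptibility identification. Combined with the 760 identified potential geological hazard units, this produced a sample dataset covering 54,272 slope units in the Huanghuang Valley.

(2) To address the imbalance of the susceptible unit samples, an identification method based on the BP neural network model was constructed. The sample dataset is highly imbalanced between susceptible and non-susceptible units. This thesis mitigated the imbalance with hybrid sampling and used the BP model to classify the resampled slope units in the region. The model identified susceptible units with 86.6% accuracy, with an overall accuracy of 81.1% and an AUC of 0.905, a good classification performance.

(3) The Bagging ensemble learning method was used to identify geological disaster susceptible units in the Huanghuang Valley with high accuracy. To improve the generalization ability and stability of the identification model, a Bagging-based identification method was built with the BP neural network model as the base model. The Bagging model identified susceptible units in the Huanghuang Valley with 94.5% accuracy, with an overall accuracy of 89.2% and an AUC of 0.945, clearly better than the traditional multiple logistic regression model (87.8% accuracy for susceptible units and 73.9% overall accuracy). In another region, the Bagging model reached 90.6% accuracy for susceptible units, 90.9% overall accuracy, and an AUC of 0.926. The Bagging model therefore has stronger data mining capability for susceptible unit identification and effectively mitigates the weak generalization and overfitting of the BP weak learner.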
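Conclusion (2) names hybrid sampling and three evaluation figures without giving details. The following is a minimal Python sketch, not the thesis's implementation. It assumes that hybrid sampling means random undersampling of the majority class combined with random oversampling of the minority class, that the 760 hazard units are counted within the 54,272 slope units, and that the "accuracy for susceptible units" corresponds to recall on the susceptible class. The function names and the per-class target of 3,000 are hypothetical.

```python
import random

def hybrid_resample(samples, labels, target_per_class, seed=0):
    """Bring every class to target_per_class samples: undersample larger
    classes without replacement, oversample smaller ones with replacement."""
    rng = random.Random(seed)
    out_x, out_y = [], []
    for cls in sorted(set(labels)):
        members = [x for x, y in zip(samples, labels) if y == cls]
        if len(members) >= target_per_class:
            chosen = rng.sample(members, target_per_class)
        else:
            # Keep every original sample, then add random duplicates.
            extra = rng.choices(members, k=target_per_class - len(members))
            chosen = members + extra
        out_x.extend(chosen)
        out_y.extend([cls] * target_per_class)
    return out_x, out_y

def class_accuracy(y_true, y_pred, cls):
    """Share of true `cls` samples predicted as `cls` (recall for that class)."""
    idx = [i for i, y in enumerate(y_true) if y == cls]
    return sum(y_pred[i] == cls for i in idx) / len(idx)

def overall_accuracy(y_true, y_pred):
    """Share of all samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """Probability that a random positive outscores a random negative
    (ties count as 0.5). Quadratic in sample count; fine for a sketch."""
    pos = [s for s, y in zip(scores, y_true) if y == 1]
    neg = [s for s, y in zip(scores, y_true) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Class counts from the abstract; unit ids stand in for feature vectors.
N_TOTAL, N_SUSCEPTIBLE = 54272, 760
unit_ids = list(range(N_TOTAL))
unit_labels = [1] * N_SUSCEPTIBLE + [0] * (N_TOTAL - N_SUSCEPTIBLE)

# target_per_class=3000 is an assumed value, not taken from the thesis.
res_ids, res_labels = hybrid_resample(unit_ids, unit_labels, target_per_class=3000)
print(f"susceptible share before: {N_SUSCEPTIBLE / N_TOTAL:.2%}")
print(f"susceptible share after:  {sum(res_labels) / len(res_labels):.2%}")
```

Reporting class-wise accuracy and AUC alongside overall accuracy matters here because, with roughly 1.4% susceptible units before resampling, a model that labels every unit as non-susceptible would still score about 98.6% overall accuracy.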

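With the slope unit as the basic evaluation unit, this thesis constructs a model based on the BP neural network and the Bagging algorithm for identifying geological disaster susceptible units in the Huanghuang Valley of Qinghai. The model achieves high accuracy, provides technical support for geological disaster prevention and control in Qinghai, and has strong practical significance.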

English abstract:

Under climate change, geological disasters are clearly increasing in northwest China, including Qinghai and Gansu. The Huanghuang Valley (the Yellow River and Huangshui River valleys) in Qinghai Province is both a high-incidence area for geological disasters and the province's main population center. High-precision identification of geological disaster susceptible units in this area is the basis of, and the key to, efficient disaster prevention and control.

This study builds a dataset of geological disaster susceptible units, with slope units as the basic evaluation units, from data on geological disaster influencing factors and on potential hazard sites. Using two machine learning models, a BP neural network and Bagging, it identifies the geological disaster susceptible units of the Huanghuang Valley in Qinghai. The main conclusions are as follows:

1) A dataset of geological disaster susceptible units for the Huanghuang Valley was established, with slope units as the basic evaluation units. Based on the requirements of susceptible unit identification and on how different factors affect geological disaster susceptibility, 13 influencing factors were selected as parameters, including precipitation, slope, terrain relief, distance to fault zones, and lithology. GIS hydrological analysis tools were used to delineate slope units in the Huanghuang Valley, and the slope unit was adopted as the basic unit for susceptibility identification. Combined with the 760 identified potential geological hazard units, this produced a sample dataset covering 54,272 slope units in the Huanghuang Valley.

2) To address sample imbalance, a susceptible unit identification method based on the BP neural network model was constructed. The sample dataset is highly imbalanced between susceptible and non-susceptible units. This study mitigated the imbalance with hybrid sampling and then used the BP model to classify the resampled slope units in the study area. The model identified susceptible units with 86.6% accuracy, with an overall accuracy of 81.1% and an AUC of 0.905, a good classification performance.

3) The Bagging ensemble learning method was used to identify geological disaster susceptible units in the Huanghuang Valley with high accuracy. To improve the generalization ability and stability of the identification model, a Bagging-based identification method was built with the BP neural network as the base model. The Bagging model identified susceptible units with 94.5% accuracy, with an overall accuracy of 89.2% and an AUC of 0.945, clearly better than the traditional multiple logistic regression model (87.8% accuracy for susceptible units and 73.9% overall accuracy). In another region, the Bagging model reached 90.6% accuracy for susceptible units, 90.9% overall accuracy, and an AUC of 0.926. The Bagging model therefore has stronger data mining capability for susceptible unit identification and effectively mitigates the weak generalization and overfitting of the BP weak learner.
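Conclusion 3) describes Bagging with the BP neural network as the base model. A minimal, standard-library-only Python sketch of that idea follows. It is illustrative rather than the thesis's implementation: the network size, learning rate, number of base models, and the two-feature synthetic data (standing in for the 13 influencing factors) are all assumptions, and `TinyBPNet`, `fit_bagging`, `bagging_proba`, and `make_toy_data` are hypothetical names.

```python
import math
import random

class TinyBPNet:
    """One-hidden-layer network trained by backpropagation (SGD, log loss)."""

    def __init__(self, n_in, n_hidden=4, lr=0.1, epochs=20, seed=0):
        self.rng = random.Random(seed)
        # The last weight in each row is the bias term.
        self.w1 = [[self.rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [self.rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
        self.lr, self.epochs = lr, epochs

    def _forward(self, x):
        h = [math.tanh(sum(w * v for w, v in zip(row, x)) + row[-1])
             for row in self.w1]
        z = sum(w * v for w, v in zip(self.w2, h)) + self.w2[-1]
        z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
        return h, 1.0 / (1.0 + math.exp(-z))

    def fit(self, X, y):
        order = list(range(len(X)))
        for _ in range(self.epochs):
            self.rng.shuffle(order)
            for i in order:
                h, p = self._forward(X[i])
                d_out = p - y[i]  # log-loss gradient at the output pre-activation
                for j, row in enumerate(self.w1):
                    # Backpropagate through tanh using the not-yet-updated w2[j].
                    d_hid = d_out * self.w2[j] * (1.0 - h[j] ** 2)
                    for k, v in enumerate(X[i]):
                        row[k] -= self.lr * d_hid * v
                    row[-1] -= self.lr * d_hid
                for j, hj in enumerate(h):
                    self.w2[j] -= self.lr * d_out * hj
                self.w2[-1] -= self.lr * d_out
        return self

    def predict_proba(self, x):
        return self._forward(x)[1]

def fit_bagging(X, y, n_models=5, seed=0, **net_kwargs):
    """Train each base network on its own bootstrap sample of (X, y)."""
    rng = random.Random(seed)
    models = []
    for m in range(n_models):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]  # with replacement
        net = TinyBPNet(len(X[0]), seed=seed + m + 1, **net_kwargs)
        models.append(net.fit([X[i] for i in idx], [y[i] for i in idx]))
    return models

def bagging_proba(models, x):
    """Average the base models' predicted probabilities."""
    return sum(m.predict_proba(x) for m in models) / len(models)

def make_toy_data(n_per_class=100, seed=1):
    """Two Gaussian clusters; purely synthetic, not the thesis data."""
    rng = random.Random(seed)
    X, y = [], []
    for label, center in ((0, -1.5), (1, 1.5)):
        for _ in range(n_per_class):
            X.append([rng.gauss(center, 1.0), rng.gauss(center, 1.0)])
            y.append(label)
    return X, y

X, y = make_toy_data()
models = fit_bagging(X, y, n_models=5, epochs=20)
probs = [bagging_proba(models, x) for x in X]
preds = [int(p >= 0.5) for p in probs]
# Accuracy on the training data only; a held-out set is needed to judge generalization.
acc = sum(p == t for p, t in zip(preds, y)) / len(y)
print(f"training accuracy of the bagged ensemble: {acc:.3f}")
```

Averaging probabilities over networks trained on different bootstrap samples reduces the variance of any single network, which is the mechanism behind the stability and generalization gains the abstract attributes to Bagging.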

This study constructs a high-accuracy model for identifying geological disaster susceptible units in the Huanghuang Valley of Qinghai, based on the BP neural network and the Bagging algorithm, with slope units as the basic evaluation units. It provides technical support for geological disaster prevention and control in Qinghai and has strong practical significance.

Total number of references:

 107

Call number:

 硕0705Z3/23013

Open access date:

 2024-06-12
