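Chinese title: | 基于夜间灯光数据的广东省台风灾害经济损失快速评估 |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 0705Z3 |
Discipline: | |
Student type: | Master's |
Degree: | Master of Science |
Degree type: | |
Degree year: | 2022 |
Campus: | |
School: | |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2022-06-17 |
Defense date: | 2022-06-02 |
Foreign-language title: | QUICKLY ASSESSMENT OF ECONOMIC LOSSES FROM TYPHOON DISASTERS IN GUANGDONG PROVINCE BASED ON NIGHTTIME LIGHT DATA |
Chinese keywords: | |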
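Chinese abstract: |
Typhoons cause heavy casualties and economic losses in China every year, and timely assessment of those losses helps the relevant departments formulate disaster prevention and mitigation policies promptly and carry out effective post-disaster reconstruction. This thesis takes the direct economic losses caused by typhoon disasters in Guangdong Province as its research object. Nighttime light data for the study area are combined with land use data and statistical yearbook data to spatialize GDP. Using case data from two typical typhoons, "Hato" and "Mangkhut", models are built with two ensemble learning algorithms, random forest regression (RFR) and gradient boosting regression (GBR). Grid search is used to find the indicator parameters that contribute most to the direct economic losses, and the four resulting models are analyzed and compared to construct a model for rapid quantitative assessment of typhoon economic losses. The conclusions are as follows: (1) The principal component formed by the hazard factors (maximum precipitation, number of rainstorm days, maximum wind speed, and typhoon-affected area) together with distance from rivers, an indicator of the disaster-formative environment, has a contribution rate above 0.45 in all four models for the quantitative assessment of direct economic losses. The principal component formed by the remaining indicators, such as the precipitation concentration index, total precipitation, and slope, has a lower contribution rate of about 0.2, but still helps improve assessment accuracy. Some social indicators are highly complex, and the vulnerability of economic indicators differs from province to province, so they cannot be generalized. (2) In the "Hato" case the gradient boosting and random forest algorithms score 0.665 and 0.661, and in the "Mangkhut" case they score 0.794 and 0.721, so GBR performs slightly better than RFR. This is because GBR fits each new regression tree to the residuals left by the trees trained before it and uses a step size to control the contribution of each tree, so that a large number of weak learners approach the optimal solution iteratively, a more refined principle than that of RFR. As long as the amount of data is sufficient to train the model, GBR can make fast and reasonable predictions of direct economic losses in areas where data are missing. (3) Parameters are calibrated by grid search based on cross-validation, with R2 as the scoring criterion, which yields the best model and parameters for each case. For the GBR model, the best combination of number of learners and learning rate is (155, 0.44) for the "Hato" case and (75, 0.21) for the "Mangkhut" case. Because typhoon loss assessment is a complex problem, the grid search exhaustively tests every parameter pair in the grid, while cross-validation keeps the parameter selection robust, which maximizes the performance of the machine learning algorithm. (4) The county-level typhoon loss assessment method based on GBR and grid search proposed in this study has an R2 of 0.794 and a mean test-set error of 32.02%. For comparison with an SVR-based typhoon loss assessment method, GBR and SVR were each fitted to the "Mangkhut" case dataset and to a dataset of 36 typhoon losses in Guangdong Province from 1998 to 2008. The test-set R2 of the GBR model on the two datasets is 0.794 and 0.893, against 0.503 and 0.61 for the SVR model, an average improvement in goodness of fit of 49.3%. This shows that the proposed model not only clearly outperforms the SVR model but also fits well when a single typhoon's economic loss is the unit of data. The innovations of this thesis are: (1) It is the first to propose a quantitative assessment method for county-level direct economic losses from typhoon disasters in Guangdong Province; the method yields specific loss values for individual counties and therefore has more reference value than existing qualitative methods. (2) It is the first to assess the direct economic losses of typhoon disasters with ensemble learning algorithms. Ensemble learning, represented by GBR and random forest, has clear advantages in preventing overfitting, improving algorithm performance, and adapting to small datasets, and is widely used in many fields. (3) The use of rasterized data improves the goodness of fit of the model. (4) In application, the proposed method assesses direct typhoon economic losses for county-level units in Guangdong Province from case data. It can supplement and correct county-level disaster data and provide a reference for emergency response and post-disaster reconstruction, and it gives good results even when training data are limited. Compared with studies at large spatial scales, research at the county scale offers more practical guidance for government work. |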
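The residual-fitting and step-size idea in conclusion (2), and the cross-validated grid search scored by R2 in conclusion (3), can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the software used, so the sketch is plain Python; the weak learner is a one-split stump on a single feature rather than a full regression tree; the data are synthetic; and the grid values are hypothetical, not the thesis's search grid.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination R^2, the scoring criterion named in the abstract."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_stump(xs, residuals):
    """Weak learner (assumption): best single split on a 1-D feature."""
    mean_r = sum(residuals) / len(residuals)
    best = (float("inf"), xs[0], mean_r, mean_r)  # fallback if no split exists
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
        if sse < best[0]:
            best = (sse, threshold, lm, rm)
    return best[1:]

def fit_gbr(xs, ys, n_estimators, learning_rate):
    """Each stump is fitted to the current residuals and scaled by the step size."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        t, lm, rm = fit_stump(xs, residuals)
        stumps.append((t, lm, rm))
        preds = [p + learning_rate * (lm if x <= t else rm) for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict_gbr(model, xs):
    base, lr, stumps = model
    return [base + lr * sum(lm if x <= t else rm for t, lm, rm in stumps) for x in xs]

def grid_search_cv(xs, ys, grid, k=5):
    """Try every (n_estimators, learning_rate) pair; score by mean k-fold R^2."""
    folds = [list(range(i, len(xs), k)) for i in range(k)]
    best_score, best_params = float("-inf"), None
    for n_estimators, learning_rate in grid:
        scores = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in range(len(xs)) if i not in held_out]
            model = fit_gbr([xs[i] for i in train], [ys[i] for i in train],
                            n_estimators, learning_rate)
            scores.append(r2_score([ys[i] for i in fold],
                                   predict_gbr(model, [xs[i] for i in fold])))
        mean_score = sum(scores) / k
        if mean_score > best_score:
            best_score, best_params = mean_score, (n_estimators, learning_rate)
    return best_params, best_score

# Synthetic data (hypothetical): one predictor with a noisy linear response.
rng = random.Random(0)
xs = [rng.uniform(0, 10) for _ in range(40)]
ys = [3.0 * x + rng.gauss(0, 1.0) for x in xs]
grid = [(n, lr) for n in (5, 20, 50) for lr in (0.05, 0.2, 0.5)]
best_params, best_score = grid_search_cv(xs, ys, grid)
print(best_params, round(best_score, 3))
```

The returned pair plays the same role as the (number of learners, learning rate) combinations reported in conclusion (3); the numbers this toy example produces have no relation to the thesis's results.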
Foreign-language abstract: |
In China, typhoons cause heavy casualties and economic losses each year. Timely assessment of the economic losses from typhoon disasters helps the relevant departments formulate disaster prevention and mitigation policies promptly and implement effective post-disaster reconstruction measures. This thesis takes the direct economic losses caused by typhoon disasters in Guangdong Province as its research object and combines nighttime light data for the study area with land use data and statistical yearbook data to spatialize GDP. The case data are analyzed and compared using two ensemble learning algorithms, random forest regression (RFR) and gradient boosting regression (GBR), and grid search is used to find the parameters that contribute most to the direct economic losses of typhoon disasters, in order to build a model for rapid quantitative assessment of those losses. The conclusions are as follows: (1) Among the hazard factors, maximum precipitation, the number of rainstorm days, maximum wind speed, and the typhoon-affected area, together with distance from rivers in the disaster-formative environment, are of great significance for the quantitative assessment of direct economic losses from typhoon disasters. Other indicators such as the precipitation concentration index, total precipitation, and slope are less important, but they still help improve assessment accuracy. Some social indicators are more complex; their vulnerability varies by region and cannot be generalized. (2) The gradient boosting and random forest algorithms score 0.665 and 0.661, respectively, in the "Hato" case and 0.794 and 0.721 in the "Mangkhut" case, so the GBR algorithm performs slightly better than the RFR algorithm. This is because GBR fits each new regression tree to the residuals of the trees before it and scales each tree's contribution by a step size, which makes it more refined than RFR and better suited to typhoon disaster loss assessment. As long as the amount of data is sufficient to train the model, the GBR algorithm can quickly and reasonably predict direct economic losses in areas with missing data. (3) In this study, grid search based on cross-validation is used for parameter selection, and good results are obtained. The optimal combination of number of learners and learning rate for the GBR model is (155, 0.44) for the "Hato" case and (75, 0.21) for the "Mangkhut" case. Because typhoon disaster loss assessment is complex, the grid search exhaustively tests every parameter pair in the grid, while cross-validation ensures that the parameter selection is robust, which maximizes the performance of the machine learning algorithm. (4) The county-level typhoon disaster loss assessment method based on GBR and grid search proposed in this thesis has an R2 of 0.794, and the average error on the test set is 32.02%. At present, there is no quantitative assessment method for county-level typhoon disaster losses in China. In quantitative studies that take a single typhoon as the smallest unit, the model proposed here also improves prediction accuracy over the existing model. |
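The two figures quoted in conclusion (4), an R2 value and an average test-set error in percent, can be computed as in the sketch below. The abstract does not define "average error"; the sketch assumes it means the mean relative error, and the loss values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def mean_relative_error(y_true, y_pred):
    """Mean of |predicted - observed| / |observed| (assumed meaning of 'average error')."""
    return sum(abs(p - t) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical observed and predicted county-level losses (arbitrary units).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 36.0]

r2 = r2_score(observed, predicted)              # 1 - 33/500 = 0.934
mre = mean_relative_error(observed, predicted)  # (0.2 + 0.1 + 0.1 + 0.1) / 4 = 0.125
print(f"R2 = {r2:.3f}, mean relative error = {mre:.1%}")
```

A high R2 and a sizeable relative error can coexist, because R2 is dominated by the largest losses while the relative error weights every county equally.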
Total number of references: | 74 |
Library holding number: | 硕0705Z3/22026 |
Open-access date: | 2023-06-17 |