View thesis information

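Chinese title:

 Analysis of the Demand and Urgency of Relief Supplies Based on Twitter: The Case of Typhoon Haiyan (基于推特的救援物资需求与紧迫程度分析 ——以海燕台风为例)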

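Name:

 张婷 (Zhang Ting)

Confidentiality level:

 Public

Thesis language:

 Chinese

Discipline code:

 070503

Discipline:

 Cartography and Geographic Information Systems

Student type:

 Master's

Degree:

 Master of Science

Degree type:

 Academic degree

Degree year:

 2021

Campus:

 Beijing campus

School:

 Faculty of Geographical Science

Research direction:

 Disasters; social media big data

First supervisor:

 程昌秀 (Cheng Changxiu)

First supervisor's affiliation:

 Faculty of Geographical Science, Beijing Normal University

Submission date:

 2021-06-10

Defense date:

 2021-05-31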

English title:

 ANALYSIS OF THE DEMAND AND URGENCY OF RELIEF SUPPLIES BASED ON TWITTER ——TAKE TYPHOON HAIYAN AS AN EXAMPLE    

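Chinese keywords (translated):

 Demand for disaster relief supplies ; spatio-temporal urgency ; Twitter ; BTM ; TextBlob ; demand dictionary ; sentiment analysis ; disaster emergency management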

English keywords:

 Demand for disaster relief supplies ; spatio-temporal urgency ; Twitter ; BTM ; TextBlob ; demand dictionary ; sentiment analysis ; disaster emergency management

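Chinese abstract (translated):

Natural disasters often cause heavy casualties and economic losses, and disaster prevention and mitigation has become a challenge shared by all of humanity. Allocating relief supplies is a key part of disaster emergency management, and sensing the actual needs of people in the affected area, and how urgent those needs are, promptly and accurately is important for guiding that allocation. With the rapid development of information technology, people now post disaster-related tweets on social media during disasters, which offers a new channel for obtaining disaster information. However, extracting supply needs from massive, unstructured, informally worded social media data poses many challenges. This thesis builds a framework, based on a topic model and sentiment analysis, for analyzing the demand for relief supplies and its urgency, and applies it to Typhoon Haiyan to analyze the spatial distribution and the spatio-temporal urgency of the public's relief supply needs during the typhoon.

The thesis first collects Haiyan-related tweets by hashtag filtering and builds a BTM topic model to identify demand-related tweets. To extract relief supply needs from these tweets, it extracts high-frequency demand words and builds a relief supply demand dictionary through word-pair expansion, synonym expansion, and local-language expansion. Because the tweets lack geolocation, it builds a three-level gazetteer of the study area, extracts place names from the tweet text, and matches each demand to a location, which locates the demand spatially and allows its spatial distribution pattern to be analyzed. Next, it uses TextBlob sentiment analysis to compute a sentiment value for each tweet and classifies the values into positive, negative, and neutral polarity. It then analyzes mean public sentiment over time and space. The daily sentiment trend is combined with the tweet topic classification to explore why sentiment changed in different periods, from which the temporal urgency of relief supply needs is inferred. Mean sentiment is also computed by province, and its spatial distribution is combined with local disaster conditions and economic conditions to infer the spatial urgency of relief supply needs.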

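The results show that:

1) The distribution of relief supply demand was obtained from the BTM topic classification, the demand dictionary, and the gazetteer. Most high-demand areas lie near the typhoon track, and demand is higher in the east than elsewhere. Leyte has the highest demand for relief supplies, followed by Cebu and Samar provinces. In a manual check, the BTM-based identification of the demand-related class reached a recall of 0.894 and a precision of 0.72, so the method can effectively extract demand-related tweets. The resulting levels of demand are broadly consistent with official reports and news coverage of Typhoon Haiyan.

2) The spatio-temporal urgency of relief supply demand was obtained from TextBlob sentiment scores. November 10 to November 13 was the period of lowest post-disaster sentiment. It was the critical period for disaster relief and the time when demand for relief supplies was most urgent. Samar and Leyte are "high demand, high urgency" areas, the ones that needed supplies most and most urgently in this disaster, followed by Capiz and Aklan. In a manual check, the precision, recall, and F1-measure of the TextBlob sentiment method all exceeded 0.6, so it identifies the sentiment orientation of tweets reasonably well.

3) During Typhoon Haiyan, mean user sentiment was 0.206, so overall public sentiment was positive. Tweet sentiment roughly followed a normal distribution, with extremely positive and extremely negative tweets making up the smallest shares, and user sentiment was relatively stable during the disaster. The level of demand for relief supplies increases with the "population/distance" ratio, with a correlation of 0.53.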

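The proposed framework comprises modules for data acquisition and processing, topic classification, the demand dictionary, the gazetteer, and sentiment analysis, and together they carry out the full analysis of relief supply demand and urgency. The framework obtains real relief supply needs directly from affected people, does not require a large body of historical cases, and does not depend on the type of disaster, so it is reasonably general. Analyzing the spatio-temporal urgency of relief supplies from the perspective of public sentiment can serve as a supporting reference for disaster emergency management.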

English abstract:

Natural disasters cause huge economic losses and casualties, and disaster prevention and mitigation has become a common challenge for all of humanity. The distribution of relief supplies is an important part of disaster emergency management. Whether relief supplies can be distributed to the disaster area accurately and in time directly affects the efficiency of rescue, and knowing the actual needs of people in the disaster area is the most important prerequisite. With the rapid development of information technology, people use social media to post disaster-related tweets during disasters, which provides a new way to obtain disaster information. However, obtaining demand information from massive, unstructured, informally worded social media data faces many challenges. This thesis therefore constructs a framework for analyzing the demand for and urgency of relief supplies based on topic models and sentiment analysis. Taking Typhoon Haiyan as an example, it analyzes the demand for relief supplies and its temporal and spatial urgency during the typhoon.

Tweets related to Typhoon Haiyan were first collected by hashtag screening, and a Biterm Topic Model (BTM) was built to identify demand-related tweets. To extract relief supply demand from these tweets, the thesis extracts the high-frequency demand words and constructs a relief supply demand dictionary through word-pair expansion, synonym expansion, and local-language expansion. Because the tweets lack location information, the thesis builds a three-level place-name dictionary from administrative division information, extracts locations from the tweet text, matches each demand to a location, and then analyzes the spatial distribution of demand for relief supplies.
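The dictionary and place-name matching step can be illustrated with a minimal Python sketch. Every entry below is a hypothetical example rather than the thesis's actual dictionary or gazetteer, and the helper names `find_terms` and `match_demand_to_places` are assumptions introduced only for illustration. The sketch pairs each demand category found in a tweet with each province resolved from a place name in the same tweet.

```python
import re

# Hypothetical demand dictionary: category -> surface forms, including
# synonyms and local-language (Tagalog) terms. Illustrative entries only.
DEMAND_TERMS = {
    "water": {"water", "drinking water", "tubig"},
    "food": {"food", "rice", "pagkain"},
    "medicine": {"medicine", "medical supplies", "gamot"},
}

# Hypothetical gazetteer: lowercase place name -> (level, province).
# The level names are assumptions, not the thesis's three levels.
GAZETTEER = {
    "leyte": ("province", "Leyte"),
    "tacloban": ("city", "Leyte"),
    "cebu": ("province", "Cebu"),
    "guiuan": ("municipality", "Eastern Samar"),
}


def find_terms(text, vocabulary):
    """Return vocabulary entries that occur in text as whole words or phrases."""
    lowered = text.lower()
    return [
        term
        for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    ]


def match_demand_to_places(tweet):
    """Pair every demand category found in a tweet with every province found."""
    categories = sorted(
        category
        for category, terms in DEMAND_TERMS.items()
        if find_terms(tweet, terms)
    )
    provinces = sorted(
        {GAZETTEER[name][1] for name in find_terms(tweet, GAZETTEER)}
    )
    return [(category, province) for category in categories for province in provinces]
```

For example, `match_demand_to_places("We need drinking water and rice in Tacloban #YolandaPH")` returns `[('food', 'Leyte'), ('water', 'Leyte')]`, while a tweet that names no known place yields an empty list and so contributes nothing to the spatial counts.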

The thesis then infers the urgency of the demand for relief supplies from public sentiment. First, an algorithm based on TextBlob calculates the sentiment value of each tweet and classifies it as positive, negative, or neutral. Second, the average public sentiment is analyzed over time and space. The daily sentiment trend is combined with the tweet classification results to explore the reasons for sentiment changes in different periods and to infer the temporal urgency of the demand for relief supplies. The average public sentiment is also calculated by province to analyze its spatial distribution, and the spatial urgency of the demand is inferred from it together with the local disaster situation and economic situation.
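The polarity split and the daily averaging can be sketched as follows. This is a minimal sketch, assuming each tweet already has a polarity score in [-1, 1], for instance from TextBlob's `TextBlob(text).sentiment.polarity`. The `neutral_band` threshold and both function names are hypothetical, since the abstract does not state the cut-offs used.

```python
from collections import defaultdict
from statistics import mean


def polarity_label(score, neutral_band=0.0):
    """Map a polarity score in [-1, 1] to 'positive', 'negative', or 'neutral'.

    neutral_band is a hypothetical threshold; the default of 0.0 treats only
    a score of exactly zero as neutral.
    """
    if score > neutral_band:
        return "positive"
    if score < -neutral_band:
        return "negative"
    return "neutral"


def daily_mean_sentiment(scored_tweets):
    """Average polarity per day from (date_string, score) pairs, sorted by day."""
    by_day = defaultdict(list)
    for day, score in scored_tweets:
        by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}
```

Grouping by a province key instead of a date string would give the per-province averages used for the spatial analysis.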

The results show that:

(1) The distribution of demand for relief supplies was obtained from the BTM, the demand dictionary, and the place-name dictionary. Most of the areas with high demand for relief supplies are located near the path of the typhoon, and demand in the eastern region is higher than in other regions. Leyte is the region with the highest demand for relief supplies, followed by Cebu and Samar provinces. In a manual inspection, the BTM topic classification identified the demand-related category with a recall of 0.894 and a precision of 0.72, so the method can effectively extract demand-related tweets. The degree of demand for relief supplies in different regions and categories obtained by this method is consistent with the official reports and news reports on Typhoon Haiyan.

(2) The temporal and spatial urgency of the demand for relief supplies was obtained with TextBlob. November 10 to November 13 was the period of lowest sentiment after the disaster, the critical period for disaster relief, and the time of most urgent demand for relief supplies. Samar and Leyte are "high demand, high urgency" areas, which have the most urgent demand for relief supplies, followed by Capiz and Aklan. The precision, recall, and F1-measure of the TextBlob-based sentiment calculation method all exceed 0.6, so it identifies the sentiment orientation of tweets reasonably well.

(3) During Typhoon Haiyan, the average user sentiment was 0.206. The overall public sentiment was positive, and tweet sentiment was roughly normally distributed. Extremely positive and extremely negative tweets accounted for the smallest shares, and user sentiment was relatively stable during the disaster. The demand for relief supplies increases with the ratio of population to distance, with a correlation of 0.53.
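The evaluation metrics quoted in the results follow the standard definitions, sketched below in Python. The function names are assumptions introduced for illustration. The last line combines the two BTM figures from result (1), reading the 0.72 value as precision (精准率) and 0.894 as recall, and gives an implied F1 of roughly 0.80. That F1 is a derived value and is not reported in the abstract.

```python
def precision_recall_f1(true_positive, false_positive, false_negative):
    """Compute precision, recall, and F1 from manual-check counts."""
    precision = true_positive / (true_positive + false_positive)
    recall = true_positive / (true_positive + false_negative)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Derived from the reported BTM figures; not a value stated in the abstract.
implied_btm_f1 = f1_from(0.72, 0.894)
```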

The framework proposed in this thesis comprises modules for data acquisition and processing, topic classification, the demand dictionary, the place-name dictionary, and sentiment analysis, which together carry out the full analysis of the demand for and urgency of relief supplies. The framework obtains the real demand for relief supplies directly from disaster-stricken people, does not require a large number of historical cases, and does not depend on the type of disaster, so it has a certain degree of generality. Analyzing the temporal and spatial urgency of relief supplies from the perspective of public sentiment can provide a supporting reference for disaster emergency management.

Total number of references:

 108    

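About the author:

 Zhang Ting (张婷). Research area: disasters and social media big data analysis. Published papers: "A topic model based framework for identifying the distribution of demand for relief supplies using social media data"; "Temporal and Spatial Evolution and Influencing Factors of Public Sentiment in Natural Disasters—A Case Study of Typhoon Haiyan"; "Assessing the Intensity of the Population Affected by a Complex Natural Disaster Using Social Media Data". Software copyrights: "Hashtag-Based Twitter Tweet and User Crawling Software V1.0" (《基于Hashtag的推特推文及用户爬取软件V1.0》); "BTM-Based Short-Text Topic Classification Tool V1.0" (《基于BTM的短文本主题分类工具软件V1.0》). Patents: "A Method and System for Configuring Scenic-Area Signposts" (《一种景区路牌配置方法及系统》); "Method and System for Intelligent Siting of Signposts in Tourist Scenic Areas" (《智能化旅游景区路牌选址方法和系统》).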

Call number:

 硕070503/21013    

Release date:

 2022-06-10    
