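Chinese title: | 基于多模态融合的旅游评论情感分析 |
Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 025200 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Applied Statistics |
Degree type: | |
Degree year: | 2024 |
Campus: | |
School: | |
Research direction: | Applied statistics |
Primary supervisor name: | |
Primary supervisor affiliation: | |
Submission date: | 2024-06-15 |
Defense date: | 2024-05-25 |
Foreign-language title: | Sentiment Analysis of Travel Reviews Based on Multimodal Fusion |
Chinese keywords: | |
Foreign-language keywords: | |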
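Chinese abstract: |
Multimodal sentiment analysis is a widely studied topic in computer science, and it involves many kinds of models. This thesis fuses three modalities (text, images, and emoji) to analyze users' travel reviews on social media platforms. It applies and evaluates three multimodal sentiment analysis models in order to find the one best suited to multimodal travel reviews on Weibo and to support subsequent tourism research. The thesis focuses on sentiment analysis models that use different fusion methods. Specifically, it selects three models with different fusion methods: the Concatbert model, the MMBT model, and the EDfIT model. It also applies multimodal sentiment analysis to travel review data, a new application that broadens the use of multimodal sentiment analysis. To this end, the thesis crawls and annotates 13,021 Weibo comments about the Palace Museum covering three modalities (text, emoji, and images), then cleans and preprocesses them. The results show that Concatbert is the most balanced of the three models and also performs best overall. The MMBT model achieves the highest accuracy, but it handles class imbalance and minority classes poorly. The EDfIT model performs worst, possibly because it was proposed for a specific dataset. Compared with other studies, the imbalanced distribution of the dataset, the small number of samples, and the weak correlation between the images and the text and emoji may all explain the weaker model performance. When choosing a multimodal sentiment analysis model, if the dataset is small, the sentiment classes differ greatly in size, and each text is paired with one image, the Concatbert model is recommended first because it is simple and effective. The results of travel review sentiment analysis have wide application. First, negative reviews can be combined with the actual situation to make targeted suggestions for improving service quality at a tourist destination. Second, these reviews contain the destination image as tourists perceive it, so they can be used to understand the destination's perceived image. Finally, positive travel reviews over a long period can be analyzed, together with their posting times and visitor numbers, to forecast visitor flow at scenic spots. The results can also be used for event identification, public opinion management, and similar tasks. |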
Foreign-language abstract: |
Multimodal sentiment analysis has attracted much attention in computer science, and it involves a wide variety of models. This thesis fuses three modalities (text, images, and emoji) to analyze users' travel reviews on social media platforms, and it applies and evaluates three multimodal sentiment analysis models in order to find the one best suited to multimodal travel reviews on Weibo and to support subsequent tourism research. The thesis focuses on sentiment analysis models that use different fusion methods. Specifically, three models with different fusion methods were selected: the Concatbert model, the MMBT model, and the TensorFlow model from the Emotion Detection from Image and Text project proposed by Erik Koci et al. Multimodal sentiment analysis is also applied to travel review data, a new application that enriches its use in tourism. To this end, the thesis crawls and annotates a total of 13,021 Weibo comments about the Palace Museum, covering text, emoji, and images, and cleans and preprocesses them. The results show that Concatbert is the most balanced of the three models and also the most effective. The MMBT model has the highest accuracy, but it handles imbalanced samples and the minority classes of the self-built dataset poorly. The project model proposed by Koci et al. performs worst, possibly because it was proposed for a specific dataset. Compared with other studies, the imbalanced distribution of the dataset, the small number of samples, and the weak correlation between the images and the text and emoji may explain the weaker model performance. When selecting a multimodal sentiment analysis model for a small, imbalanced dataset in which each text corresponds to one image, the Concatbert model is recommended first because it is simple and effective, followed by the MMBT model.
The results of travel review sentiment analysis have wide application. First, negative comments can be combined with the actual situation to make suggestions for improving the quality of tourist destinations. Second, these travel reviews contain the destination image as tourists know it, so they can be used to understand the perceived image of the tourist destination. Finally, travel reviews over a long period can be analyzed, together with their posting times and the number of tourists, to forecast visitor flow at scenic spots. The results can also be used in event identification and public opinion control. |
Total references: | 45 |
Holding location: | Main Library B301 |
Call number: | 硕025200/24077Z |
Release date: | 2025-06-15 |