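Chinese title: | 移动游戏市场表现数据研究——基于机器学习算法 |
Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 025200 |
Discipline and major: | |
Student type: | Master's |
Degree: | Master of Applied Statistics |
Degree type: | |
Degree year: | 2023 |
Campus: | |
School: | |
Research direction: | Mathematical Economic Statistics and Big Data Management |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2023-06-29 |
Defense date: | 2023-05-19 |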
Foreign-language title: | Research on Mobile Game Market Performance Data——Based on Machine Learning |
Chinese keywords: | |
Foreign-language keywords: | Mobile Games ; Emotional Analysis ; Decision Tree ; Random Forest ; XG-boost |
Chinese abstract: |
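As smartphone performance has improved and mobile network technology has matured, mobile games have grown rapidly, but rapid growth has also intensified competition. Facing an increasingly competitive industry, mobile game companies attach growing importance to competitor analysis. Competitor analysis rests on competitor screening, yet screening in the mobile game industry is still done manually, which is time-consuming, labor-intensive, and inefficient. Against this background, this thesis explores how statistical methods can be applied to competitor analysis in the mobile game industry, using machine learning classification models to investigate the factors that influence mobile game market performance data. Data on a total of 89013 mobile game products released on the App Store were obtained from the Sensor Tower database. With the month as the time unit, the Boston matrix method is used to convert each product's monthly downloads and month-on-month download growth rate into four product types (Star, Cash Cow, Dog, and Question Mark), which serve as the dependent variable. The independent variables form an influencing-factor system built along five dimensions: the game's own attributes, time, production and publishing, payment status, and game rating. In the modeling part, decision tree, random forest, and XGBoost models are each tuned and fitted on the dataset. For the decision tree model, a line chart over the model's maximum depth was plotted, and the parameter value with the highest prediction accuracy before overfitting appeared was selected; for the random forest and XGBoost models, grid search was used for parameter setting and modeling. After modeling, the confusion matrix of each of the three classification models was computed, and precision, recall, and F1 score were then derived from the confusion matrices to evaluate model performance. Finally, based on the importance scores the three models assign to the influencing factors, the features with the most significant influence on mobile game market performance data are identified, and practical suggestions for competitor analysis by mobile game companies are given. The innovations of this thesis lie mainly in the novel perspective of its topic, its choice of technical route, and research results of practical significance. The conclusions are that the XGBoost model has the best overall predictive performance, the random forest model has the highest overall prediction accuracy, and the decision tree model performs only moderately. Based on the ranking of variable importance scores across the three classification models, six influencing factors are judged to be relatively important: installation package size, game rating, current month, gameplay type, art style, and theme setting. These can serve as predictive indicators for screening mobile game competitors. |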
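The Boston matrix labelling step summarized in the abstract can be illustrated with a minimal sketch. The abstract does not state which cut-offs separate high from low downloads or growth, so the per-month median used here is an assumption, and the function names `bcg_label` and `label_month` and the toy records are hypothetical.

```python
from statistics import median


def bcg_label(downloads, growth, downloads_cut, growth_cut):
    """Map one game-month to a Boston matrix quadrant.

    Monthly downloads stand in for market share, as in the abstract.
    """
    high_volume = downloads >= downloads_cut
    high_growth = growth >= growth_cut
    if high_volume and high_growth:
        return "Star"
    if high_volume:
        return "Cash Cow"
    if high_growth:
        return "Question Mark"
    return "Dog"


def label_month(records):
    """Label every game in a single month.

    records: list of (game_id, monthly_downloads, month_on_month_growth).
    ASSUMPTION: the cut-offs are the medians of that month's values.
    """
    downloads_cut = median(d for _, d, _ in records)
    growth_cut = median(g for _, _, g in records)
    return {
        game_id: bcg_label(d, g, downloads_cut, growth_cut)
        for game_id, d, g in records
    }


# Toy data invented for illustration; not taken from the thesis.
sample = [
    ("a", 9000, 0.40),
    ("b", 8000, -0.10),
    ("c", 500, 0.60),
    ("d", 300, -0.30),
]
labels = label_month(sample)
```

With these toy records the four games fall into the four different quadrants, which is the four-class target the classification models would then be trained on.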
Foreign-language abstract: |
Mobile gaming has developed rapidly as smartphone performance has improved and mobile network technology has advanced. However, rapid development often means intensified competition. Faced with an increasingly fierce industry environment, mobile game companies are paying more attention to competitor analysis. The foundation of competitor analysis is competitor screening. Nevertheless, competitor screening in the mobile gaming industry still relies on manual work, which is time-consuming and inefficient. Against this background, this thesis explores the application of statistical methods to competitor analysis in the mobile gaming industry, using machine learning classification models to investigate the factors influencing mobile game market performance data. Data on 89013 mobile games launched on the App Store were obtained from the Sensor Tower database, and the Boston matrix method was used to convert each game's monthly download volume and month-on-month download growth rate into four product types: Star, Cash Cow, Dog, and Question Mark. These product types serve as the dependent variable. The independent variables form an influencing-factor system covering the game's own attributes, time, production and distribution, payment status, and game rating. The modeling part uses decision tree, random forest, and XGBoost models. For the decision tree model, a line chart over the maximum depth was drawn, and the parameter value with the highest prediction accuracy before overfitting set in was selected as the final parameter; for the random forest and XGBoost models, grid search was used for parameter setting and modeling. 
After modeling, the confusion matrix of each of the three classification models is calculated, and precision, recall, and F1 score are then computed from the confusion matrices to evaluate model performance. Finally, the thesis identifies the features with the most significant impact on the market performance data of mobile game products, based on the importance scores the three models assign to the influencing factors, and provides feasible suggestions for competitor analysis by mobile game companies in light of industry practice. The innovation of this thesis lies mainly in the novel perspective of its topic, its choice of technical route, and research results with practical significance. The conclusions are that the XGBoost model has the best overall predictive performance, the random forest model has the highest overall prediction accuracy, and the decision tree model performs relatively poorly compared with the other two. According to the ranking of variable importance scores across the three classification models, six influencing factors are judged to be relatively important: application installation package size, game rating, current month, game genre, art style, and game setting. These can be used as predictive indicators for mobile game competitor screening. |
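The evaluation step described in the abstract (precision, recall, and F1 score derived from a confusion matrix) can be sketched as follows for the four-class case. This is an illustration only: the helper names `per_class_metrics` and `accuracy` are hypothetical, and the counts in `toy_cm` are invented, not results from the thesis.

```python
def per_class_metrics(cm, labels):
    """Return {label: (precision, recall, f1)} from a confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j (rows = true, columns = predicted).
    """
    n = len(labels)
    metrics = {}
    for k, name in enumerate(labels):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, true class differs
        fn = sum(cm[k]) - tp                       # true k, predicted class differs
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[name] = (precision, recall, f1)
    return metrics


def accuracy(cm):
    """Share of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / total if total else 0.0


# Class order follows the abstract; the counts are invented for illustration.
classes = ["Star", "Cash Cow", "Dog", "Question Mark"]
toy_cm = [
    [8, 1, 1, 0],
    [2, 6, 0, 2],
    [1, 0, 7, 2],
    [0, 1, 2, 7],
]
scores = per_class_metrics(toy_cm, classes)
overall = accuracy(toy_cm)
```

Reporting per-class scores alongside overall accuracy matters here because, as the abstract notes, the model with the highest accuracy (random forest) is not the one judged best overall (XGBoost).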
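Total number of references: | 44 |
Author profile: | 朱芃睿 (Zhu Pengrui), graduate student in Applied Statistics, class of 2021, Beijing Normal University |
Call number: | 硕025200/23038 |
Open-access date: | 2024-06-29 |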