Thesis information

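Chinese title:

 基于分类和生存分析模型对电信用户流失的研究

Author:

 Wan Yingkai (万颖恺)

Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 025200

Major:

 Applied Statistics

Student type:

 Master's

Degree:

 Master of Applied Statistics

Degree type:

 Professional degree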

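Degree year:

 2023

Campus:

 Beijing campus

School:

 School of Statistics

Research direction:

 Applied Statistics

First advisor:

 Li Hui (李慧)

First advisor's affiliation:

 School of Statistics

Submission date:

 2023-06-26

Defense date:

 2023-05-19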

English title:

 A study of telecoms subscriber churn based on classification and survival analysis models

Chinese keywords:

 电信流失 ; 随机森林 ; 生存分析 ; 加速失效时间模型

English keywords:

 Telecom churn ; Random forest ; Survival analysis ; Accelerated failure time model

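Chinese abstract (translated):

With the development of the telecom industry, the telecom services market has gradually approached saturation; the pool of new customers is increasingly limited and increasingly costly to develop. Operators therefore need to consider how to strengthen the loyalty of existing customers and reduce churn. In the big-data era, operators can draw on their well-developed internal customer management systems to build sound models and mine the latent information in customer data, so as to understand customers better, identify their needs, avoid the risks associated with churn, and extend the customer life cycle.
In view of this, this thesis uses data from a telecom operator to analyze churn from two angles: predicting churn outcomes and uncovering the causes of churn. On the one hand it helps the operator identify customers who are inclined to churn; on the other it summarizes risk factors and the characteristics of churned customers, and offers the operator corresponding suggestions for optimizing its operations.
First, to predict churn outcomes, the thesis builds three classification models: logistic regression, support vector machine, and random forest. Comparing the three on accuracy, recall, and other metrics shows that random forest is the best-performing classifier on this dataset, with an accuracy of 76.76% and a recall of 80.11%, indicating that the model not only classifies customers well but is also sensitive in identifying churned customers.
Second, the thesis introduces survival analysis to study the causes of churn. By plotting survival curves and fitting an accelerated failure time model, it traces how churn evolves over time and, while analyzing the effect of each variable on churn risk, quantifies the survival differences between the levels of each variable. The results show that, beyond customers' personal characteristics, the services provided by the operator are the key cause of churn. Contract type, internet service, monthly charges, payment method, and paperless billing are important factors in churn, contract type and internet service especially: the average survival time of customers on two-year contracts is 2.991 times that of customers on month-to-month contracts, and the average survival time of customers without internet service is 3.851 times that of customers on fiber-optic lines.
Finally, based on the model results and analysis, the thesis proposes directions and concrete measures for the operator's future business adjustments in five areas: service pricing, product and service strategy, contract term design, customer profiling and segmentation, and churn early-warning mechanisms.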

English abstract:

As the telecoms industry develops, the market for telecoms services has gradually become saturated, and new customers are increasingly scarce and expensive to acquire. Operators therefore need to consider how to improve the stickiness of existing customers and reduce churn. In the big-data era, operators can use their internal customer management systems to build sound models and mine the latent information in customer data, so that they can better understand customers, clarify customer requirements, avoid the risks associated with churn, and extend the customer life cycle.
Using data from a telecom operator, this thesis analyzes churn from two perspectives: predicting churn outcomes and uncovering the causes of churn. On the one hand it helps the operator identify customers with a tendency to churn; on the other it summarizes risk factors and the characteristics of churned customers and offers the operator corresponding optimization suggestions.
First, to predict churn outcomes, three classification models are constructed: Logistic Regression, Support Vector Machine, and Random Forest. A comparison of the three models on accuracy, recall, and other metrics shows that Random Forest is the best classifier on this dataset, with an accuracy of 76.76% and a recall of 80.11%, indicating that the model not only classifies customers well but also identifies churned customers sensitively.
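Both metrics quoted for the classifiers can be read off a binary confusion matrix. The sketch below only illustrates how accuracy and recall are defined, with churn treated as the positive class. The `ConfusionCounts` type and the counts in the usage lines are hypothetical; they are not the thesis's code or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with churn as the positive class."""
    tp: int  # churned customers predicted as churned
    fp: int  # retained customers predicted as churned
    tn: int  # retained customers predicted as retained
    fn: int  # churned customers predicted as retained


def accuracy(c: ConfusionCounts) -> float:
    """Share of all customers whose label was predicted correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total


def recall(c: ConfusionCounts) -> float:
    """Share of actual churners that the model flagged."""
    return c.tp / (c.tp + c.fn)


# Hypothetical counts for illustration only (not taken from the thesis).
example = ConfusionCounts(tp=80, fp=30, tn=70, fn=20)
print(f"accuracy={accuracy(example):.2%}, recall={recall(example):.2%}")
```

Because accuracy is a prevalence-weighted average of recall and specificity, a recall higher than the accuracy implies a specificity lower than the accuracy.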
Second, this thesis introduces a survival analysis approach to study the causes of customer churn. By plotting survival curves and constructing an Accelerated Failure Time model, we trace how customer churn changes over time and, while analyzing the impact of each variable on churn risk, quantify the survival differences between the levels of each variable. The results show that, beyond customers' personal characteristics, the services offered by the operator are the key cause of churn. Contract type, internet service, monthly charges, payment method, and paperless billing are important factors influencing churn, especially contract type and internet service. The average survival time of customers on a two-year contract is 2.991 times that of customers on a month-to-month contract, and the average survival time of customers without internet service is 3.851 times that of customers on a fiber optic line.
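The multiples reported for contract type and internet service are what an accelerated failure time model expresses as time ratios. As a minimal sketch, assuming the usual log-linear form log T = β₀ + βx + σε (the abstract states neither the error distribution nor the fitted coefficients), exp(β) is the factor by which survival time is multiplied for a level relative to its reference level. The function names below are hypothetical.

```python
import math


def time_ratio(beta: float) -> float:
    """Multiplicative effect on survival time implied by an AFT coefficient."""
    return math.exp(beta)


def implied_coefficient(ratio: float) -> float:
    """AFT coefficient that would produce the given time ratio."""
    if ratio <= 0:
        raise ValueError("time ratio must be positive")
    return math.log(ratio)


# Time ratios quoted in the abstract, each relative to its reference level.
reported = {
    "two-year vs. month-to-month contract": 2.991,
    "no internet service vs. fiber optic": 3.851,
}

for label, ratio in reported.items():
    beta = implied_coefficient(ratio)
    print(f"{label}: ratio={ratio}, implied coefficient={beta:.3f}")
```

Under this form a ratio above 1 corresponds to a positive coefficient, that is, to longer expected tenure than the reference level.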
Finally, based on the model results and analysis, directions and related initiatives for the operator's future business adjustments are proposed in five areas: service pricing, product and service strategy, contract term setting, customer profile classification, and a churn warning mechanism.

Total number of references:

 36

Call number:

 硕025200/23003

Public release date:

 2024-06-25    
