Thesis Information

Title (Chinese):

 基于数据挖掘技术的多因子选股模型研究

Name:

 原浩然 (Yuan Haoran)

Discipline code:

 025200

Major:

 Master of Applied Statistics

Student type:

 Master's

Degree:

 Master of Science

Degree year:

 2015

Campus:

 Beijing campus

School:

 School of Mathematical Sciences

Research direction:

 Applied Statistics

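First supervisor:

 金蛟 (Jin Jiao)

First supervisor's affiliation:

 School of Statistics, Beijing Normal University

Second supervisor:

 吕光明 (Lü Guangming)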

Submission date:

 2015-06-04

Defense date:

 2015-05-10

Title (English):

 A MULTI-FACTOR MODEL FOR STOCK PORTFOLIO BASED ON DATA MINING TECHNOLOGY    

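Abstract (translated from the Chinese):
The founding of the Shanghai Stock Exchange and the Shenzhen Stock Exchange in 1990 marked the start of a regulated capital market in China, which has recorded a series of notable achievements over 25 years of development. Over the same period, the growth of the Internet has made large volumes of listed-company data easy to exchange and obtain. The money-making effect of the A-share market draws more and more investors into the capital market, and both institutional and individual investors have begun using data mining models to simulate and predict market behavior and to develop their own investment strategies. This thesis studies how to use a quantitative model to select stocks with investment value. It selects effective factors, builds a stock selection model on a generalized cumulative logit model, and checks the model's validity with statistical tests, with the aim of helping investors find the most valuable portfolio.
The candidate stock pool consists of all listed companies in the industrial machinery industry under the Wind industry classification, over the period from the end of August 2009 to the end of December 2014. Companies listed for less than two years to date are removed first. For each remaining company, monthly data on 25 indicators in 5 categories (financial growth, financial quality, analyst forecasts, valuation, and price-volume) are sorted from largest to smallest and split into levels 1 to 5, and each indicator passes or fails the factor validity test according to the final monthly returns of its levels. Factors that pass are then screened with association rules to find those with the strongest causal relationship to returns, leaving 10 factors in 5 categories: operating income year-on-year growth rate (single quarter), net profit year-on-year growth rate (single quarter), total asset turnover ratio, cash to maturing debt ratio, ROE (current period), number of rating adjustments, PS (TTM), PEcut (TTM), PB, and one-month turnover rate.
Samples of these factors from 31 August 2009 to 31 December 2013 form the training set for the generalized cumulative logit model, in which 5 factors in 5 categories pass the model significance test: net profit year-on-year growth rate (single quarter), ROE (current period), number of rating adjustments, PB, and one-month turnover rate. The model is used to compute, for each stock in each month, the probability of significantly outperforming the benchmark return. Stocks are ranked in descending order of that probability, and the top 10 in each month are held with equal weights. Data from 31 January 2014 to 31 December 2014 serve as the test set for out-of-sample testing, where the 10 stocks with the highest probability of outperforming the benchmark are again selected as each month's portfolio.
Model validity is tested against two market benchmarks, the industrial machinery industry index and the CSI 300 Index. Both the return measures and the risk measures of these portfolios are better than those of the benchmarks. A simple study of portfolio size finds that the cumulative net value over the holding period and the Sharpe ratio are highest at 5 stocks, so investors using this model are advised to hold 5 stocks. The historical data show that building portfolios with the model described here is worthwhile, which gives stock investors a practical quantitative stock selection method.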
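The factor validity step described in the abstract above (sorting each indicator from largest to smallest into levels 1 to 5, then comparing returns across levels) could be sketched roughly as follows. This is a minimal illustration, not the thesis's code: the function names `assign_levels` and `mean_return_by_level` are hypothetical, and using the mean return per level as the comparison statistic is an assumption, since the abstract does not state the exact test.

```python
from statistics import mean


def assign_levels(factor_values):
    """Sort stocks by factor value, largest first, and split them into levels 1-5.

    Level 1 holds the largest values and level 5 the smallest (assumed convention).
    Returns one level per stock, in the same order as the input.
    """
    n = len(factor_values)
    # Indices of the stocks ordered from largest to smallest factor value.
    order = sorted(range(n), key=lambda i: factor_values[i], reverse=True)
    levels = [0] * n
    for rank, idx in enumerate(order):
        levels[idx] = rank * 5 // n + 1
    return levels


def mean_return_by_level(factor_values, returns):
    """Average the (hypothetical) monthly return of the stocks in each level."""
    grouped = {}
    for level, ret in zip(assign_levels(factor_values), returns):
        grouped.setdefault(level, []).append(ret)
    return {level: mean(rets) for level, rets in sorted(grouped.items())}


# Made-up data for ten stocks in one month: factor values and returns.
factor = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
rets = [0.10, 0.08, 0.06, 0.04, 0.02, 0.00, -0.02, -0.04, -0.06, -0.08]
by_level = mean_return_by_level(factor, rets)
```

In this made-up example the average return falls steadily from level 1 to level 5, which is the kind of separation between levels that a factor would need to show. A real check would repeat this for every month and indicator.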
Abstract (English):
China's capital market has matured steadily and recorded significant achievements since the Shanghai Stock Exchange and the Shenzhen Stock Exchange were founded in 1990. At the same time, the spread of the Internet has made it convenient to acquire and use large amounts of data on listed companies. The “money-making effect” attracts more and more investors into the capital market, and both institutional and individual investors have begun building their own investment strategies, using many kinds of data to simulate and predict how the market works. This thesis studies how to build a quantitative model for choosing valuable stocks. The model is built by selecting effective factors and fitting a cumulative logit model. The stock pool contains all listed companies in the machinery industry as classified by Wind, over the period from the end of August 2009 to the end of December 2014. First, companies listed for less than two years to date are removed. Then each of the 25 factors, which fall into five categories (financial growth, financial quality, analysts' forecasts, valuation, and price), is sorted from largest to smallest and divided into 5 levels. The validity of each factor is tested by observing whether the returns of its levels can be told apart. Finally, association rules are applied to the valid factors to find ten effective factors with a causal relationship to returns.
The ten effective factors are operating income year-on-year growth rate, net profit year-on-year growth rate (single quarter), total asset turnover ratio, cash to maturing debt ratio, ROE, rating adjustments, PS (TTM), PEcut (TTM), PB, and one-month turnover rate. A cumulative logit model is fitted using data on the ten factors from the end of August 2009 to the end of November 2013 as the training set. The coefficients of net profit year-on-year growth rate (single quarter), ROE, PB, rating adjustments, and one-month turnover rate pass the significance test. The model is then used to calculate each stock's probability of significantly beating the benchmark rate of return, and in each month the stocks are ranked from the largest probability to the smallest. The top 10 stocks form an equally weighted portfolio. Data from the end of December 2013 to the end of December 2014 are used as the test set for an out-of-sample test, in which the same procedure is applied. The thesis finds that the portfolio built with the cumulative logit model is better in both rate of return and risk control than the market benchmarks, which are the industrial machinery index and the Shanghai and Shenzhen 300 index. It also finds that the portfolio reaches its maximum net value and Sharpe ratio when it contains 5 stocks. Testing on historical data shows that using this model to build an investment portfolio is effective.
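The scoring and selection step described above (compute each stock's probability of significantly beating the benchmark, rank from largest to smallest, hold the top stocks with equal weights) could be sketched as follows. This is a hedged illustration only. It uses the standard proportional-odds form of the cumulative logit model, P(Y <= j | x) = sigmoid(alpha_j - beta . x), which may differ from the generalized form used in the thesis. The thresholds, coefficients, tickers, and the function names `category_probabilities` and `select_top_n` are all hypothetical, and treating the highest ordered category as "significantly beats the benchmark" is an assumption.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(x, thresholds, beta):
    """Category probabilities under a proportional-odds cumulative logit model.

    P(Y <= j | x) = sigmoid(thresholds[j] - beta . x); thresholds must be increasing.
    Returns one probability per ordered category, lowest category first.
    """
    eta = sum(b * v for b, v in zip(beta, x))
    cumulative = [sigmoid(a - eta) for a in thresholds] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


def select_top_n(factor_rows, thresholds, beta, n=10):
    """Rank stocks by the probability of the highest category and equal-weight the top n."""
    scores = {
        ticker: category_probabilities(x, thresholds, beta)[-1]
        for ticker, x in factor_rows.items()
    }
    ranked = sorted(scores, key=scores.get, reverse=True)[:n]
    if not ranked:
        return {}
    weight = 1.0 / len(ranked)
    return {ticker: weight for ticker in ranked}


# Made-up example: three stocks, one factor, three ordered return categories.
rows = {"A": [0.0], "B": [2.0], "C": [1.0]}
thresholds = [-1.0, 1.0]   # hypothetical fitted cut points
beta = [1.0]               # hypothetical fitted coefficient
portfolio = select_top_n(rows, thresholds, beta, n=2)
```

With a positive coefficient, a larger factor value moves probability toward the highest category, so stocks B and C are selected here and each receives half the weight. In practice the thresholds and coefficients would come from fitting the model on the training set.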
Total number of references:

 37

Collection number:

 硕025200/1522

Release date:

 2015-06-04    
