Thesis information

Chinese title:

 主观幸福感的层级贝叶斯小域估计——基于CHIPS调查数据    

Name:

 苏萌    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 071400    

Discipline:

 Statistics

Student type:

 Master's student

Degree:

 Master of Science

Degree type:

 Academic degree

Degree year:

 2023    

Campus:

 Beijing campus

School:

 School of Statistics

Research area:

 Statistical theory and applications

First supervisor's name:

 段小刚    

First supervisor's affiliation:

 School of Statistics

Submission date:

 2023-06-19    

Defense date:

 2023-05-13    

Foreign-language title:

 HIERARCHICAL BAYESIAN SMALL AREA ESTIMATION OF SUBJECTIVE WELL-BEING BASED ON CHIPS SURVEY DATA    

Chinese keywords:

 小域估计 ; 层级贝叶斯 ; 哈密顿蒙特卡洛 ; NUTS算法 ; 幸福率    

Foreign-language keywords:

 Small area estimation ; Hierarchical Bayes ; Hamiltonian Monte Carlo ; NUTS algorithm ; Happiness rate    

Chinese abstract:

近年来,根据我国国情的需要,我国政府对“多层次推断”的需求不断增加,即通过大型抽样调查得到的数据,在满足总体(如全国)推断需求的同时,也希望能够实现对小域(如省、市、区县)的推断。“多层次推断”的需求意味着希望基于大总体的抽样调查数据得到可靠的小域估计量。因此,本文对小域估计的理论方法及应用进行研究,并应用小域估计方法估计我国各地区的居民幸福率。
首先,对小域估计的基本理论方法进行了系统的梳理与总结,从改进估计方法的角度,将小域估计方法分为基于设计的估计法和基于模型的估计法。基于设计的估计法是一种直接估计法,仅使用落在目标小域内部的样本进行估计。基于模型的估计是一种间接估计方法,借助统计模型从相似小域中“借力”来间接增加目标小域的有效样本量,可以分为隐式模型估计法和显式模型估计法。隐式模型估计主要包括合成估计和组合估计,显式模型根据可用辅助信息的层次主要分为小域层次模型和单元层次模型。对于显式模型估计法,可以使用经验最佳线性无偏预测、经验贝叶斯以及层级贝叶斯方法对小域目标量进行推断。
其次,针对基于模型的层级贝叶斯小域估计方法进行研究,对层级贝叶斯推理方法马尔可夫链蒙特卡洛(MCMC)、哈密顿蒙特卡洛(HMC)以及基于HMC的NUTS算法进行了详细介绍。与传统的MCMC方法如Metropolis-Hastings算法相比,HMC避免了随机游走,可以更高效的探索状态空间,并且对于建议状态具有更高的接受率。NUTS算法进一步克服了HMC需要人为指定算法参数的难题。此外,基于子域层次线性混合效应模型,提出了子域层次logit-normal层级贝叶斯模型,并将其应用于响应变量为二项数据的小域幸福率的估计。
本文选取各地区主观幸福率作为衡量区域居民整体幸福水平的指标,基于中国家庭收入调查项目(CHIP)的调查数据和山东省及各城市统计年鉴数据、2010年人口普查山东省分县资料数据,采用子域层次logit-normal层级贝叶斯模型并利用NUTS算法,对山东省各区县的主观幸福率进行估计。研究结果表明,基于层级贝叶斯模型的小域估计方法,对于样本量很小甚至样本量为零的区域都能得到目标参数的可靠估计。对山东省各区县主观幸福率的实证分析结果表明,沿海地区比内陆地区主观幸福率更高,经济发展水平高的地区居民幸福率一般也会比较高。对于各个城市内部,县级市的主观幸福率一般会比市区大部分区域更高。本文的研究结果也进一步论证了小域估计方法尤其是层级贝叶斯小域估计在我国社会统计等领域的适用性。

Foreign-language abstract:

In recent years, the Chinese government's demand for “multi-level inference” has grown steadily: data from large-scale sample surveys are expected to support inference not only for the overall population (for example, the whole country) but also for small areas (such as provinces, cities, districts and counties). Meeting this demand requires reliable small area estimators built from survey data that were designed for a large population. This thesis therefore studies the theory, methods and application of small area estimation, and applies small area estimation methods to estimate the happiness rate of residents in different regions of China.
First, the basic theory and methods of small area estimation are systematically reviewed and summarized. From the perspective of how the estimator is improved, small area estimation methods are divided into design-based methods and model-based methods. Design-based estimation is a direct method that uses only the sample units falling within the target small area. Model-based estimation is an indirect method that uses a statistical model to “borrow strength” from similar small areas, thereby indirectly increasing the effective sample size of the target small area; it can be further divided into implicit-model estimation and explicit-model estimation. Implicit-model estimation mainly comprises synthetic estimation and composite estimation. Explicit models are mainly divided into area-level models and unit-level models, according to the level at which auxiliary information is available. For explicit-model estimation, empirical best linear unbiased prediction (EBLUP), empirical Bayes and hierarchical Bayes methods can be used to make inferences about small area target quantities.
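The contrast between direct and indirect estimation can be illustrated with a minimal Python sketch. It is not taken from the thesis: the area counts are made up, the pooled rate stands in for a synthetic estimate, and the weighting rule with the hypothetical parameter `n_ref` is an assumption chosen only to show how weight shifts from the borrowed estimate to the direct one as the area sample grows.

```python
from typing import Optional


def direct_estimate(happy: int, n: int) -> Optional[float]:
    """Direct estimate: uses only the sample that falls inside the area."""
    if n == 0:
        return None  # no sample in the area, so no direct estimate exists
    return happy / n


def composite_estimate(happy: int, n: int, synthetic: float,
                       n_ref: float = 20.0) -> float:
    """Weighted average of the direct and synthetic estimates.

    The weight phi = n / (n + n_ref) is a hypothetical rule used for
    illustration: larger area samples put more weight on the direct estimate.
    """
    direct = direct_estimate(happy, n)
    if direct is None:
        return synthetic  # rely entirely on the borrowed estimate
    phi = n / (n + n_ref)
    return phi * direct + (1.0 - phi) * synthetic


# Made-up counts per area: (respondents reporting being happy, sample size).
areas = {"A": (45, 60), "B": (1, 4), "C": (0, 0)}
total_happy = sum(h for h, _ in areas.values())
total_n = sum(n for _, n in areas.values())
synthetic = total_happy / total_n  # pooled rate shared by all areas

estimates = {name: composite_estimate(h, n, synthetic)
             for name, (h, n) in areas.items()}
```

In this sketch the well-sampled area A stays close to its own rate, the thinly sampled area B is pulled strongly toward the pooled rate, and the unsampled area C receives the pooled rate itself.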
Second, model-based hierarchical Bayesian small area estimation is studied. The inference methods used for hierarchical Bayesian models, namely Markov chain Monte Carlo (MCMC), Hamiltonian Monte Carlo (HMC) and the HMC-based No-U-Turn Sampler (NUTS), are introduced in detail. Compared with traditional MCMC methods such as the Metropolis-Hastings algorithm, HMC avoids random-walk behavior, explores the state space more efficiently, and achieves a higher acceptance rate for proposed states. NUTS further removes the need to specify HMC's algorithm parameters by hand. In addition, building on the sub-area level linear mixed-effects model, a sub-area level logit-normal hierarchical Bayesian model is proposed and applied to estimating small area happiness rates when the response variable is binomial.
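To make the HMC description concrete, the following is a minimal sketch of a single-variable HMC sampler with a leapfrog integrator. It is an illustration under stated assumptions rather than the sampler used in the thesis: the target is a standard normal density chosen for simplicity, and `step_size` and `n_steps` are exactly the kind of hand-specified parameters that NUTS is designed to avoid tuning manually.

```python
import math
import random


def potential(q: float) -> float:
    """Negative log density of the target (standard normal, an assumption)."""
    return 0.5 * q * q


def grad_potential(q: float) -> float:
    return q


def leapfrog(q: float, p: float, step_size: float, n_steps: int):
    """Approximate Hamiltonian dynamics with the leapfrog integrator."""
    p -= 0.5 * step_size * grad_potential(q)      # initial half step for momentum
    for i in range(n_steps):
        q += step_size * p                        # full step for position
        if i < n_steps - 1:
            p -= step_size * grad_potential(q)    # full step for momentum
    p -= 0.5 * step_size * grad_potential(q)      # final half step for momentum
    return q, p


def hmc_sample(n_samples: int, step_size: float = 0.2, n_steps: int = 10,
               seed: int = 0):
    rng = random.Random(seed)
    q = 0.0
    samples, accepted = [], 0
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)                  # resample the momentum
        q_new, p_new = leapfrog(q, p0, step_size, n_steps)
        h_old = potential(q) + 0.5 * p0 * p0
        h_new = potential(q_new) + 0.5 * p_new * p_new
        # Metropolis correction for the integrator's energy error.
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_samples


samples, acceptance_rate = hmc_sample(2000)
```

Because each proposal follows the gradient along a trajectory instead of taking a random step, the proposed state can lie far from the current one while the energy error, and hence the rejection probability, stays small.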
The subjective happiness rate of each region is chosen as the indicator of the overall happiness level of its residents. Based on survey data from the China Household Income Project (CHIP), data from the statistical yearbooks of Shandong Province and its cities, and the county-level data for Shandong Province from the 2010 population census, the subjective happiness rate of each district and county in Shandong Province is estimated using the sub-area level logit-normal hierarchical Bayesian model fitted with the NUTS algorithm. The results show that small area estimation based on the hierarchical Bayesian model yields reliable estimates of the target parameters even for areas with very small or zero sample sizes. The empirical analysis of the subjective happiness rate of districts and counties in Shandong Province shows that the rate is higher in coastal areas than in inland areas, and that areas with a higher level of economic development generally have higher happiness rates. Within each city, the subjective happiness rate of county-level cities is generally higher than that of most urban districts. These results further support the applicability of small area estimation methods, especially hierarchical Bayesian small area estimation, to social statistics and related fields in China.
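One way to see why areas with zero sample size can still receive an estimate is to write a model of this kind out. The following is only a sketch of a generic sub-area level logit-normal specification. The abstract does not give the exact form, so the covariates $x_{ij}$, the two random effects $v_i$ and $u_{ij}$, the priors $\pi(\cdot)$, and the reading of $i$ as cities (areas) and $j$ as districts or counties (sub-areas) are all assumptions made for illustration.

```latex
\begin{align*}
y_{ij} \mid p_{ij} &\sim \mathrm{Binomial}(n_{ij},\, p_{ij}), \\
\operatorname{logit}(p_{ij}) &= x_{ij}^{\top}\beta + v_i + u_{ij}, \\
v_i \mid \sigma_v^2 &\sim N(0, \sigma_v^2), \qquad
u_{ij} \mid \sigma_u^2 \sim N(0, \sigma_u^2), \\
(\beta, \sigma_v^2, \sigma_u^2) &\sim \pi(\beta)\,\pi(\sigma_v^2)\,\pi(\sigma_u^2).
\end{align*}
```

Under this sketch, a sub-area with $n_{ij} = 0$ contributes no likelihood term, but its $p_{ij}$ still has a posterior distribution through the regression term $x_{ij}^{\top}\beta$, the shared area effect $v_i$, and a draw of $u_{ij}$ from $N(0, \sigma_u^2)$.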

Total number of references:

 93    

Call number:

 硕071400/23006    

Open access date:

 2024-06-18    
