Thesis information

Chinese title:

 基于集成学习的精细化人口时空分布研究——以中国北京市为例    

Author:

 包文轩    

Confidentiality level:

 Public    

Thesis language:

 Chinese (chi)    

Discipline code:

 081602    

Discipline:

 Photogrammetry and Remote Sensing    

Student type:

 Master's    

Degree:

 Master of Engineering    

Degree type:

 Academic degree    

Degree year:

 2023    

Campus:

 Beijing campus    

School:

 Faculty of Geographical Science    

Research direction:

 Population spatialization    

First supervisor:

 宫阿都    

First supervisor's affiliation:

 Faculty of Geographical Science    

Submission date:

 2023-06-12    

Defense date:

 2023-05-28    

English title:

 Fine-Grained Population Spatiotemporal Distribution Based on Ensemble Learning: A Case Study of Beijing, China    

Chinese keywords:

 集成学习 ; 百度热力图 ; 人口空间化 ; 空间降尺度 ; 动态人口分布    

English keywords:

 Ensemble Learning ; Baidu Heat Map ; Population Spatialization ; Spatial Downscaling ; Dynamic Population Distribution    

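Abstract (translated from the Chinese):

      Accurate population distribution data with high spatiotemporal resolution are of great value to many application areas, including public health, urban planning, and disaster management. However, because complex patterns of human activity are only partly understood, mapping such data over large areas remains challenging. The random forest model is the most widely used model in population spatialization research, yet its limited grasp of feature variables, together with the complexity of the population spatialization problem, means that a reliable model for accurately mapping the spatial distribution of population is still lacking. To address these problems, this study builds an ensemble population spatialization model with the Stacking ensemble learning algorithm, proposes a spatial downscaling framework for population distribution that combines the dasymetric modeling method with Baidu heat map data, and generates high spatiotemporal resolution (hourly, 100 m) population density maps of Beijing for a regular weekday. It also analyzes the factors that explain the spatial distribution of population and the spatiotemporal characteristics of urban population density. The main research contents and conclusions are as follows:
(1) Construction of a population spatialization model based on ensemble learning
       This study uses the Stacking ensemble learning algorithm to integrate GBDT, XGBoost, LightGBM, and SVR into an ensemble population spatialization model, GXLS-Stacking. Socioeconomic data and natural environment data are combined with census data to train the GXLS-Stacking model and to generate a high-accuracy gridded population density dataset (the ambient population dataset) for Beijing in 2020 at a spatial resolution of 100 m. Accuracy is then validated at the pixel level with community household registration data. The results show that the GXLS-Stacking model predicts the spatial distribution of population with high accuracy (R² = 0.8004, MAE = 34.67 persons/hectare, RMSE = 54.92 persons/hectare), and that a city's socioeconomic features characterize the spatial distribution of population and the intensity of human activity better than its natural environment features do. The resulting high-accuracy gridded population density map can also be used to analyze the spatial pattern of population density in large cities.
(2) A dynamic population distribution mapping method based on heat maps
       This study integrates the ambient population data generated by the GXLS-Stacking model with nighttime light data and building volume data and, based on the dasymetric modeling method, builds a spatial downscaling framework for Baidu heat map data during working hours and sleeping hours. The framework is used to map population density in Beijing at high spatiotemporal resolution, and mobile phone signaling data are used for accuracy validation. The results show that the proposed spatial downscaling framework is highly accurate; that population density in Beijing on a regular weekday is "centripetally concentrated by day and centrifugally dispersed by night"; that the interaction between the purposes of residents' activities and differences in urban spatial function drives the spatiotemporal evolution of population density; and that China's "precise prevention and control, dynamic zero-COVID" policy was vigorously implemented, ensuring people's safety and freedom of movement to the greatest extent. In addition, the framework can be transferred to other regions, which is of considerable value for government emergency response and for research on the risks that environmental problems pose to people.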

English abstract:

        Accurate population distribution data with high spatiotemporal resolution are valuable for many applications, such as public health, urban planning, and disaster management. However, mapping such data over large areas remains challenging because complex human activity patterns are only partly understood. The random forest model is the most widely used model in population spatialization studies, but because of its limited ability to capture feature variables and the complexity of the population spatialization problem, reliable models for accurately mapping the spatial distribution of population are still lacking. In this study, an ensemble population spatialization model is constructed with the Stacking ensemble learning algorithm. Building on the dasymetric modeling method and Baidu heat map data, a spatial downscaling framework for population distribution is proposed, and high spatiotemporal resolution (hourly, 100 m) population density maps are generated for a regular weekday in Beijing. The study also analyzes the factors that explain the spatial distribution of population and the spatiotemporal characteristics of urban population density. The main research contents and conclusions are as follows:
(1) Construction of a population spatialization model based on ensemble learning
        This study uses the Stacking ensemble learning algorithm to integrate GBDT, XGBoost, LightGBM, and SVR into an ensemble population spatialization model, GXLS-Stacking. Socioeconomic data and natural environmental data are combined with census data to train the GXLS-Stacking model and to generate a high-accuracy gridded population density dataset (the ambient population dataset) for Beijing in 2020 at a spatial resolution of 100 m. Accuracy is then validated at the pixel level using community household registration data. The results show that the GXLS-Stacking model predicts the spatial distribution of population with high accuracy (R² = 0.8004, MAE = 34.67 persons/hectare, RMSE = 54.92 persons/hectare). Compared with natural environmental features, a city's socioeconomic features better characterize the spatial distribution of population and the intensity of human activity. In addition, the high-accuracy gridded population density map generated by the GXLS-Stacking model can be used to analyze the spatial pattern of population density in large cities.
(2) A dynamic population distribution mapping method based on heat maps
        By integrating the ambient population data generated by the GXLS-Stacking model with nighttime light data and building volume data, and based on the dasymetric modeling method, this study constructs a spatial downscaling framework for Baidu heat map data during work time and sleep time and maps the population density distribution of Beijing at high spatiotemporal resolution. Mobile phone signaling data are then used for accuracy validation. The results show that the proposed spatial downscaling framework is highly accurate for both work time and sleep time; that the population distribution in Beijing on a regular weekday shows a pattern of "centripetal concentration in the daytime, centrifugal dispersion at night"; that the interaction between the purposes of residents' activities and differences in urban spatial function drives the spatiotemporal evolution of the population distribution; and that China's "surgical control and dynamic zero COVID-19" epidemic prevention and control policy was strongly implemented. In addition, the spatial downscaling framework can be transferred to other regions, which is of value for government emergency response and for research on the risks that environmental problems pose to people.
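The core step of dasymetric downscaling can be illustrated with a short sketch: a total observed for one coarse cell is split across the fine cells it contains in proportion to an ancillary weight, so that the coarse total is preserved. The function names, the cell identifiers, the numbers, and the even-split fallback below are all hypothetical; the abstract does not state how the thesis combines ambient population, nighttime light, and building volume into weights for work time and sleep time.

```python
def downscale_cell(coarse_total, weights):
    """Split one coarse-cell total across fine cells in proportion to weights."""
    weight_sum = sum(weights)
    if weight_sum == 0:
        # No ancillary signal in this cell: fall back to an even split
        # (an assumption made for this sketch only).
        return [coarse_total / len(weights)] * len(weights)
    return [coarse_total * w / weight_sum for w in weights]


def downscale(coarse_totals, fine_weights):
    """Apply the proportional split to every coarse cell."""
    return {
        cell_id: downscale_cell(total, fine_weights[cell_id])
        for cell_id, total in coarse_totals.items()
    }


# Hypothetical hourly totals for two coarse heat-map cells, and the
# ancillary weights of the four fine (e.g. 100 m) cells inside each.
coarse_totals = {"A": 1200.0, "B": 300.0}
fine_weights = {
    "A": [3.0, 1.0, 0.0, 2.0],
    "B": [0.0, 0.0, 0.0, 0.0],
}

fine = downscale(coarse_totals, fine_weights)
for cell_id, values in fine.items():
    print(cell_id, values, "sum =", sum(values))
```

Because each fine value is a share of the coarse total, the fine cells in a coarse cell always sum back to the observed value, which is the property that lets the downscaled maps stay consistent with the heat map at its native resolution.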

Total number of references:

 140    

Call number:

 硕081602/23015    

Open-access date:

 2024-06-12    
