Chinese title: | 基于不同方法的多变量高频气温预测比较研究 |
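Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 025200 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Applied Statistics |
Degree type: | |
Degree year: | 2024 |
Campus: | |
School: | |
Research area: | Data Science and Management |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2024-06-14 |
Defense date: | 2024-05-25 |
Foreign-language title: | A COMPARATIVE STUDY OF MULTIVARIATE HIGH-FREQUENCY TEMPERATURE FORECASTS BASED ON DIFFERENT METHODS |
Chinese keywords: | |
Foreign-language keywords: | Temperature prediction; Time-series data; High-frequency; Prophet model; LSTM model |
Chinese abstract: |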
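Against the backdrop of global warming and increasingly frequent extreme weather, accurate forecasts of meteorological change can greatly benefit daily life. Temperature is a key meteorological variable, so high-frequency temperature forecasting has direct practical value. For example, it can support real-time monitoring and prediction of extreme heat events, providing key information for disaster warning and response; it can also give farmers real-time planting advice, such as choosing a suitable sowing time or adjusting crop varieties, to help them cope with the challenges of climate change. Temperature changes have an important impact on every sector. |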
Foreign-language abstract: |
Against the background of global warming and frequent extreme weather, accurate prediction of meteorological changes can greatly facilitate people's lives. Temperature is a very important meteorological factor, so high-frequency temperature prediction has direct practical significance. For example, it can be used for real-time monitoring and prediction of extreme high-temperature events, providing key information for disaster warning and response; it can also give farmers real-time planting advice, such as choosing an appropriate sowing time and adjusting crop varieties, to meet the challenges brought by climate change. Changes in temperature have an important impact on all sectors. However, high-frequency temperature prediction remains a complex and challenging task because temperature is affected by many factors, such as atmospheric general circulation, ocean activity, and topography. A comparative study of different methods for high-frequency temperature prediction helps clarify the strengths and weaknesses of each method and provides more reliable theoretical support for practical forecasting. This paper therefore focuses on predicting multivariate high-frequency temperature data with different methods. By mining the intrinsic information in meteorological data and comparing models, it aims to find a model that is well suited to high-frequency temperature prediction and applies well in practice. The work is organized as follows. (1) Processing and analyzing the Jena meteorological dataset, including data preprocessing, data visualization, and the screening and construction of feature variables. 
The Jena meteorological dataset is first preprocessed to handle duplicate records, missing values, and outliers, and to convert data formats. The trends of the multivariate meteorological data and the correlations among the variables are then visualized and analyzed, and line charts, heatmaps, violin plots, and other visualizations are used to explore the latent characteristics of the variables. For feature screening and construction, this paper on the one hand screens meteorological features with XGBoost by ranking feature importance; on the other hand, it decomposes the temperature series with STL, then uses AR and Prophet to fit the resulting components and construct time-series features for trend and seasonality, and finally merges all features. (2) Based on the processed Jena (Germany) meteorological dataset, different models are used for high-frequency temperature prediction and their predictive performance is analyzed. The models are Prophet, XGBoost, univariate and multivariate LSTM, and a combined Prophet_LSTM model; XGBoost is additionally run with and without the newly constructed features. Predictive performance is compared using the MAE, MSE, and RMSE of each model's predictions. The results show that the univariate LSTM and XGBoost with the newly constructed variables are more effective at predicting high-frequency temperature in Jena. 
(3) To further examine how the univariate LSTM and multivariate XGBoost models perform on other high-frequency meteorological datasets, prediction experiments are run on hourly meteorological data for Daxing District, Beijing, for 2023, obtained from ECMWF. After data preprocessing and analysis, this paper finds that both the univariate LSTM and the multivariate XGBoost models still predict the Beijing temperature data well, but the LSTM model is more effective at predicting hourly temperature in Beijing and applies more readily, so it can serve as a reference for predicting other high-frequency datasets in the future. |
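The abstract compares models by the MAE, MSE, and RMSE of their predictions. The following is a minimal sketch of such a comparison, not the thesis's actual code: the function names, the model labels `model_a` and `model_b`, and all temperature values are hypothetical and chosen only for illustration. It assumes each forecast has the same length as the observed series.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mse(actual, predicted):
    """Mean squared error between two equal-length sequences."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(actual, predicted))


def rank_models(actual, forecasts):
    """Return (name, MAE, MSE, RMSE) tuples sorted by RMSE, lowest first."""
    rows = [
        (name, mae(actual, pred), mse(actual, pred), rmse(actual, pred))
        for name, pred in forecasts.items()
    ]
    return sorted(rows, key=lambda row: row[3])


# Hypothetical observed temperatures and two hypothetical model forecasts.
actual = [10.0, 12.0, 11.0, 9.0]
forecasts = {
    "model_a": [10.5, 11.5, 11.0, 9.0],
    "model_b": [12.0, 12.0, 9.0, 9.0],
}

for name, m_mae, m_mse, m_rmse in rank_models(actual, forecasts):
    print(f"{name}: MAE={m_mae:.3f} MSE={m_mse:.3f} RMSE={m_rmse:.3f}")
```

Sorting by RMSE is one possible choice; the abstract does not say which of the three metrics was given priority when the models were ranked.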
Total references: | 43 |
Holding location: | Main Library B301 |
Call number: | 硕025200/24032Z |
Open-access date: | 2025-06-14 |