View thesis information

Title (Chinese):

 小样本场景下地块级水稻产量估算模型构建与应用    

Name:

 李安祺    

Confidentiality level:

 Public

Thesis language:

 chi    

Discipline code:

 082506T    

Major:

 Resources and Environmental Science

Student type:

 Bachelor's

Degree:

 Bachelor of Science

Degree year:

 2024    

Campus:

 Beijing campus

School:

 Faculty of Geographical Science

First supervisor name:

 陈晋    

First supervisor affiliation:

 Faculty of Geographical Science

Submission date:

 2024-05-26    

Defense date:

 2024-05-09    

Title (English):

 A few-shot model of yield estimation for individual paddy field    

Keywords (Chinese):

 水稻产量 ; 产量估算模型 ; 小样本 ; 样本合成法    

Keywords (English):

 Rice yield ; Yield estimation model ; Few-shot ; Sample synthesis    

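Abstract (translated from Chinese):

  Changes in temperature and precipitation caused by global climate change affect crop growth and development, and timely, accurate estimation of crop yield helps safeguard food security. Rice is the staple food of more than half of the world's population. Remote sensing offers high timeliness and high accuracy for rice yield estimation. Among remote sensing approaches, empirical models estimate yield by establishing linear or nonlinear relationships between yield and features, need little phenological knowledge as support, and are widely used. However, current plot-level yield estimation models do not yet adequately consider the spatial heterogeneity of crop growth, research on selecting feature sets for yield estimation at the field scale is lacking, and existing models require many field-measured samples, which are time-consuming and laborious to collect.

  The study takes Changshu City, Jiangsu Province, as the study area and collects 2023 Sentinel-1 and Sentinel-2 remote sensing data, Google Earth imagery, and measured yield data for rice plots. After data preprocessing on the GEE platform, vegetation index time series are constructed and rice plots are extracted. Histograms are used to characterize the spatial heterogeneity of crop growth within plots, and a feature set for plot-level rice yield estimation is constructed. Based on the idea of the synthetic control method, and in combination with the machine learning algorithms random forest regression, support vector regression, and extreme gradient boosting, a sample synthesis method is constructed to estimate rice yield at the plot scale while accounting for the spatial heterogeneity of crop growth. Measured yield per unit area of rice plots is used to cross-validate the yield estimates. The results show that:

1. The histogram information of a rice plot reflects the spatial heterogeneity of crop growth within the plot. The histogram information of the per-pixel maximum NDPI and of the pixels at the heading stage is well correlated with yield.

2. The NDPI mean time series, the harvest index, the heading-stage NDPI, and the pixel histogram information of a rice plot can be used as features for yield estimation.

3. The sample synthesis method can predict the yield of rice plots at a relatively high level of accuracy; the yield synthesis accuracy at the county level reaches 91.18%. Specifically:

(1) Among the regression methods, the linear model performs relatively well; the best model synthesizes the yield of the target plot with high accuracy, with an rRMSE of 8.04% and an R² of 0.53.

(2) Among the sampling methods, Latin hypercube sampling and stratified sampling avoid the model's dependence on selecting the plots with the maximum and minimum yields and achieve good sample synthesis accuracy.

(3) Overall, the more samples in the donor pool, the better the sample synthesis; model accuracy is essentially stable once the number of samples reaches 7.

(4) Among the vegetation indices, NDPI is generally better than the others for yield synthesis, and the linear model remains the best performer across the different vegetation indices.

  The study will help achieve timely and accurate crop yield monitoring and provide a scientific basis for the rational formulation of agricultural production policies and food security development plans.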
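The abstract describes summarizing within-plot variation in crop growth with pixel histograms of NDPI. The following is only a minimal sketch of that idea, not the thesis's implementation. It assumes NDPI denotes the Normalized Difference Phenology Index with the commonly cited weight of 0.74 (the record does not spell out the acronym), and the function names, bin count, value range, and pixel values are all hypothetical.

```python
def ndpi(nir, red, swir1, alpha=0.74):
    """NDPI for one pixel from surface reflectances.

    Assumed definition: (NIR - mix) / (NIR + mix), where
    mix = alpha * Red + (1 - alpha) * SWIR1.
    """
    mix = alpha * red + (1.0 - alpha) * swir1
    return (nir - mix) / (nir + mix)


def histogram_features(values, n_bins=5, lo=0.0, hi=1.0):
    """Relative frequency of pixel values in n_bins equal-width bins on [lo, hi].

    Values outside [lo, hi] are counted in the first or last bin.
    """
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        idx = int((v - lo) / width)
        idx = min(max(idx, 0), n_bins - 1)  # clamp to a valid bin index
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts]


# Hypothetical NDPI values for the pixels of one plot at one date.
plot_pixels = [0.1, 0.3, 0.3, 0.9]
plot_histogram = histogram_features(plot_pixels)
```

A histogram like this keeps information that a plot mean discards: two plots with the same mean NDPI can differ in how many pixels fall into the low-vigor bins.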

Abstract (English):

  Changes in temperature and precipitation caused by global climate change have significantly affected crop growth and development. Timely and accurate estimation of crop yield helps ensure food security. Rice is the staple food of more than half of the world's population. Remote sensing offers high timeliness and accuracy in crop yield estimation. Among yield estimation methods, empirical models are widely used: they establish linear or nonlinear relationships between yield and features and require little phenological knowledge. However, plot-level yield estimation models have not fully considered the spatial heterogeneity of crop growth, research on feature set selection at the plot scale is inadequate, and existing models need large numbers of field samples, which are time-consuming and laborious to collect.

  Experiments are carried out in Changshu, Jiangsu Province, using Sentinel-1 and Sentinel-2 products from 2023, Google Earth images, and measured paddy field yield data. Remotely sensed data are preprocessed on the GEE platform to construct vegetation index time series and extract paddy fields. Histograms are used to depict the spatial heterogeneity of crop growth within plots, and a feature set for rice yield estimation at the plot scale is then constructed. Based on the idea of the synthetic control method and combined with the machine learning algorithms random forest regression, support vector regression, and extreme gradient boosting regression, a sample synthesis method is constructed to estimate rice yield while considering the spatial heterogeneity of crop growth. Measured yield per unit area of the paddy fields is used to cross-validate the model. The results are as follows:

1. The spatial heterogeneity of crop growth within a plot can be represented by the histogram information of the rice plot. The histogram information of the per-pixel maximum NDPI and of the NDPI at the heading stage is well correlated with yield.

2. The NDPI mean time series, the harvest index, the NDPI at the heading stage, and the pixel histogram information of a plot can be used as features for yield estimation.

3. The sample synthesis method can predict the yield of rice plots with high accuracy. The synthesis accuracy at the county level reaches 91.18%. Specifically:

(1) Among the regression methods, the linear model has the highest accuracy and robustness. The optimal model synthesizes the yield of the target field with high accuracy, with an rRMSE of 8.04% and an R² of 0.53.

(2) Among the sampling methods, Latin hypercube sampling and stratified sampling avoid the model's dependence on selecting the plots with the maximum and minimum yields and achieve better sample synthesis accuracy.

(3) In general, the more samples in the donor pool, the better the sample synthesis performs. Model accuracy is basically stable once the number of samples reaches 7.

(4) Among the vegetation indices, NDPI generally outperforms the others. The linear model remains the best performer across the different vegetation indices.

  The research will help achieve timely and accurate monitoring of crop yield and provide a scientific basis for the rational formulation of agricultural production policies and food security development plans.
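The abstract builds its sample synthesis method on the idea of the synthetic control method and reports rRMSE and R². The sketch below illustrates only the textbook form of that idea, not the thesis's actual procedure. In it, a target plot is approximated by a convex combination of donor-pool plots whose features match the target, and the same weights are applied to the donors' measured yields. The exponentiated-gradient solver, the function names, and the toy numbers are assumptions. rRMSE is taken to be RMSE divided by the mean observed yield, and R² to be the coefficient of determination; the thesis may define either differently.

```python
import math


def fit_donor_weights(donor_features, target_features, steps=500, lr=0.5):
    """Fit non-negative weights summing to 1 so that the weighted donor
    features approximate the target features (exponentiated gradient).
    Features are assumed to be scaled to roughly [0, 1]."""
    n = len(donor_features)
    k = len(target_features)
    weights = [1.0 / n] * n
    for _ in range(steps):
        # Feature-space residual between the synthetic plot and the target.
        resid = [
            sum(weights[i] * donor_features[i][j] for i in range(n))
            - target_features[j]
            for j in range(k)
        ]
        # Gradient of the squared residual norm with respect to each weight.
        grad = [
            2.0 * sum(resid[j] * donor_features[i][j] for j in range(k))
            for i in range(n)
        ]
        # Multiplicative update plus renormalisation keeps the weights
        # non-negative and summing to 1.
        weights = [w * math.exp(-lr * g) for w, g in zip(weights, grad)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return weights


def synthesize_yield(donor_yields, weights):
    """Weighted combination of the donors' measured yields."""
    return sum(w * y for w, y in zip(weights, donor_yields))


def rrmse(observed, predicted):
    """Relative RMSE: RMSE divided by the mean observed value."""
    n = len(observed)
    mse = sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n
    return math.sqrt(mse) / (sum(observed) / n)


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical donor pool: one scaled feature per plot, yields in t/ha.
donor_features = [[0.0], [1.0]]
donor_yields = [6.0, 10.0]
weights = fit_donor_weights(donor_features, [0.25])
estimate = synthesize_yield(donor_yields, weights)
```

In a leave-one-out evaluation, each measured plot would in turn be treated as the target and synthesized from the remaining donors, and `rrmse` and `r_squared` would then be computed over all held-out estimates.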

Total number of references:

 76    

Total number of figures:

 20    

Total number of tables:

 7    

Library holding number:

 本082506T/24005    

Release date:

 2025-05-27    
