With the rapid advancement of remote sensing technology, image data have become increasingly abundant. Deep-learning-based extraction of regions of interest (ROIs) has emerged as a crucial research direction in remote sensing image processing, providing vital technical support for disaster monitoring, urban planning, military reconnaissance, and other fields. The "deep learning + massive remote sensing data + strong annotation" paradigm has achieved remarkable success in ROI extraction from remote sensing images. However, its heavy reliance on precise pixel-level annotation raises two prominent issues. First, strong annotation requires every pixel to be labeled precisely by hand, which is time-consuming, labor-intensive, and costly. Second, multi-source high-resolution remote sensing images pose additional challenges: their scenes are complex and variable, precise annotation demands extensive domain expertise, and the difficulty and workload of pixel-level annotation increase significantly. In recognition of these issues, some scholars have shifted their focus toward weak annotation.
Currently, the most commonly used form of weak annotation is image-level annotation, which establishes a correspondence between target categories and image pixels by dynamically inferring pixel-level category labels, using pseudo-labels as a bridge. Compared with strong annotation, weak annotation significantly reduces annotation cost and improves annotation efficiency. However, because research in this direction started relatively late, numerous issues remain to be addressed.
1) From the perspective of sample annotation, weak annotations provide limited guidance and supervisory information and are prone to mislabeling and missed labels, which weakens the decision-making capability of ROI extraction algorithms.
2) From the perspective of input data, the quality of remote sensing images is susceptible to weather conditions: image contrast decreases, the boundaries of target regions blur, and the accuracy of ROI extraction under weak annotation is severely constrained.
3) From the perspective of algorithm performance, deep learning models trained under weak annotation often lack expressive power and usually require cumbersome post-processing, resulting in low testing and output efficiency.
Saliency analysis is a theoretical method inspired by the visual attention mechanism of the human eye. It quickly selects a few salient regions in a scene for priority interpretation, avoiding time-consuming analysis of the entire image, and is therefore an effective means of rapidly extracting important targets. Combining saliency analysis with deep learning can effectively enhance the learning and expressive capability of a network, helping to achieve accurate extraction of regions of interest in remote sensing images under weak annotation. This is of great significance for alleviating the contradiction between the high speed at which remote sensing images are acquired and the low speed at which they are interpreted.
To address the aforementioned issues, this paper starts from gradually improving label quality and combines saliency analysis with semantic feature perception. Staged processing models are designed for three dataset conditions: inexact annotation, inaccurate annotation, and incomplete annotation. Accurate and efficient ROI extraction from remote sensing images under weak annotation is thereby achieved, effectively improving testing and output efficiency. The main research work of this paper includes:
1) To address the low output efficiency of algorithms under inexact annotation, a method for extracting regions of interest from remote sensing images based on cross-perception of salient features is proposed. The method consists of two parts: generation of an initial pixel-level pseudo-label saliency map, and final semantic collaborative segmentation. In the first stage, a binary classification network is constructed, and a saliency map computation method based on multi-layer category feature attention is designed to generate pixel-level pseudo-labels, achieving the leap from image-level to pixel-level annotation and laying the foundation for subsequent ROI extraction. In the second stage, an ROI extraction network based on a double-branch nested U-Net is constructed. The network combines multiple residual modules, extracts local features in the input convolutional layer, reduces detail loss through a symmetric encoder-decoder structure, and fuses information at different scales through a multi-scale feature fusion module. In addition, a salient-feature cross-perception module is embedded in the network so that the model can fully perceive cross-feature information between images, thereby achieving collaborative ROI segmentation of image pairs. The method improves the accuracy of semantic segmentation in remote sensing images, reduces detail loss, and enhances the feature extraction capability of the network.
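The multi-layer category feature attention step in the first stage follows the spirit of class activation mapping: each channel of a feature map is weighted by the classifier weight of the target class, the weighted maps are fused across layers, and the result is thresholded into pixel-level pseudo-labels. A minimal NumPy sketch under those assumptions (the function names, thresholds, and the simple averaging fusion are illustrative, not the thesis's exact formulation):

```python
import numpy as np

def class_attention_map(features, weights):
    """CAM-style map: weight each channel by its classifier weight and sum.

    features: (C, H, W) feature maps; weights: (C,) classifier weights
    for the target class.
    """
    cam = np.tensordot(weights, features, axes=([0], [0]))  # -> (H, W)
    cam = np.maximum(cam, 0)                # keep only positive evidence
    return cam / (cam.max() + 1e-8)         # normalize to [0, 1]

def fuse_multilayer(cams):
    """Fuse attention maps from several layers by simple averaging."""
    return np.mean(np.stack(cams), axis=0)

def pseudo_label(cam, fg_thresh=0.6, bg_thresh=0.2):
    """Map saliency to {1: foreground, 0: background, 255: ignore}."""
    label = np.full(cam.shape, 255, dtype=np.uint8)  # uncertain pixels ignored
    label[cam >= fg_thresh] = 1
    label[cam <= bg_thresh] = 0
    return label
```

Pixels between the two thresholds are marked as "ignore" so that only confident pseudo-labels supervise the downstream segmentation network.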
2) To address the problem that input data are susceptible to weather conditions and degrade in quality, a method for extracting regions of interest from fog-covered remote sensing images based on comparative analysis of salient features is proposed. The method consists of three parts: generating initial pseudo-labels, correcting the pseudo-labels through comparative analysis of salient features, and extracting regions of interest from fog-covered images through domain adaptation. In the first part, a classification network is constructed, and class feature perception with multi-layer feature fusion is used to obtain initial pixel-level pseudo-labels. In the second part, a salient-feature-comparison similarity sorting algorithm is designed to compare local features between images, reduce the influence of haze noise, and correct the initial pixel-level pseudo-labels. In the third part, an unsupervised domain adaptation method is proposed that adjusts the entropy distribution of the target domain to resemble that of the source domain, indirectly achieving entropy minimization; the network thus adapts to both clear and fog-covered images, improving the generalization ability and robustness of the model.
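The entropy alignment in the third part rests on the per-pixel prediction entropy: predictions on clear source-domain images are confident (low entropy), while fog-degraded target-domain predictions are diffuse (high entropy), and the adaptation pushes the latter distribution toward the former. A minimal sketch of the entropy-map computation (the shapes and function names are illustrative assumptions):

```python
import numpy as np

def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    e = np.exp(logits - logits.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def entropy_map(logits):
    """Per-pixel Shannon entropy of the predicted class distribution.

    logits: (num_classes, H, W). High entropy marks uncertain pixels,
    typical of fog-degraded target-domain predictions.
    """
    p = softmax(logits, axis=0)
    return -(p * np.log(p + 1e-12)).sum(axis=0)
```

In adversarial formulations of such alignment, a discriminator is trained to distinguish source from target entropy maps while the segmentation network learns to fool it, which indirectly minimizes target-domain entropy without target labels.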
3) To address the problem that inaccurate labels weaken the decision-making ability of the remote sensing image ROI extraction model, a method based on label noise cleaning and salient-feature uncertainty analysis is proposed. The method consists of two parts: label noise cleaning with pixel-level pseudo-label generation, and region-of-interest extraction based on salient-feature uncertainty analysis. In the first part, initial pseudo-labels are generated through iterative learning and noise cleaning, a multi-layer category feature attention saliency map is then generated from these initial pseudo-labels, and superpixel segmentation is used to preserve image edge details. In the second part, a complementary salient-feature uncertainty perception algorithm is proposed to handle the complexity and diversity of remote sensing image features. The method accounts for the complexity of features among different ground objects, integrates the uncertainty of model predictions with the complementarity of salient features across datasets, and introduces an uncertainty perception strategy to improve feature utilization and enhance the final extraction results.
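The noise-cleaning idea in the first part can be illustrated with the widely used small-loss criterion: samples whose current training loss is small are more likely to carry clean labels and are retained for the next iteration, while high-loss (likely mislabeled) samples are set aside. The uncertainty side can likewise be sketched as the variance of stochastic forward passes. A minimal NumPy sketch under those assumptions (the keep ratio and the variance-based score are illustrative choices, not the thesis's exact algorithm):

```python
import numpy as np

def small_loss_clean(losses, keep_ratio=0.7):
    """Mark the keep_ratio fraction of samples with the smallest loss as clean."""
    k = max(1, int(len(losses) * keep_ratio))
    clean = np.zeros(len(losses), dtype=bool)
    clean[np.argsort(losses)[:k]] = True    # smallest-loss samples survive
    return clean

def predictive_uncertainty(prob_stack):
    """Per-pixel uncertainty as variance across stochastic forward passes.

    prob_stack: (num_passes, H, W) foreground probabilities, e.g. from
    Monte Carlo dropout. High variance flags pixels whose pseudo-labels
    should be down-weighted when training the extraction network.
    """
    return prob_stack.var(axis=0)
```

Iterating the clean-sample selection as the model improves progressively tightens the pseudo-label set, which is the mechanism behind "iterative learning and noise cleaning."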
4) To address incomplete and low-precision ROI annotation for remote sensing images, a semi-supervised region-of-interest extraction method based on multi-cue fusion of salient features is proposed. The method consists of two parts. The first is multi-cue saliency analysis of remote sensing images, which locates target regions in a subset of images using visual saliency algorithms and generates pixel-level pseudo-labels. The second is semi-supervised region-of-interest extraction based on adaptive consistency, which combines adaptive learning with semi-supervised learning to train the extraction model under incomplete annotation. The method improves the efficiency and accuracy of region-of-interest extraction, especially for incompletely annotated datasets, effectively alleviating the scarcity of annotated data.
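The adaptive-consistency idea in the second part can be sketched in the style of confidence-thresholded consistency regularization: confident predictions on a weakly augmented view of an unlabeled image supply hard pseudo-labels, and the model is penalized when its prediction on a strongly augmented view disagrees. A minimal per-image NumPy sketch (the threshold value and function names are illustrative assumptions, not the thesis's exact loss):

```python
import numpy as np

def consistency_loss(p_weak, p_strong, conf_thresh=0.95):
    """Masked cross-entropy between the strong view and confident weak-view pseudo-labels.

    p_weak, p_strong: (num_classes, H, W) softmax probabilities of the same
    unlabeled image under weak and strong augmentation. Only pixels whose
    weak-view confidence exceeds conf_thresh contribute to the loss.
    """
    conf = p_weak.max(axis=0)                 # per-pixel confidence
    hard = p_weak.argmax(axis=0)              # per-pixel pseudo-label
    mask = conf >= conf_thresh                # keep only confident pixels
    picked = np.take_along_axis(p_strong, hard[None], axis=0)[0]
    ce = -np.log(picked + 1e-12)              # pixel-wise cross-entropy
    return float((ce * mask).sum() / max(mask.sum(), 1))
```

Making the confidence threshold adaptive, e.g. per class or per training stage, is one way the "adaptive" component can temper the loss when pseudo-labels are still unreliable.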
The above work is an important attempt to combine saliency analysis with new theories and methods of deep learning, and it provides a new line of research for low-level computer vision based on deep learning. The results have important application value for information analysis and extraction in remote sensing, medicine, astronomy, and other fields.