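Chinese title: | 基于遥感数据的单类分类器比较与改进研究 (Comparison and Improvement of One-Class Classifiers Based on Remote Sensing Data) |
Name: | |
Discipline code: | 081405 |
Discipline: | |
Student type: | Master's |
Degree: | Master of Engineering |
Degree year: | 2012 |
Campus: | |
School: | |
Research direction: | Remote sensing classification |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2012-06-05 |
Defense date: | 2012-05-16 |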
Foreign-language title: | Comparing and Improvement of One-Class classifier based on Remote Sensed data |
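Chinese abstract: |
One-class classification means training a classifier on samples from a single class, when samples of only that class are available, and then using the trained classifier to decide whether new samples of unknown class belong to it. In remote sensing image classification, when samples of a certain class cannot be obtained or are very few relative to other classes, the class sample sizes become imbalanced and traditional two-class or multi-class methods no longer adapt well. Alternatively, when a user only needs to extract one specific class from an image, labeling large numbers of samples from other classes costs extra time. In such cases a one-class classifier is needed, so studying the application of one-class classifiers to remote sensing image classification is of great significance. This thesis first summarizes and reviews existing one-class classification methods, focuses on several typical one-class classification algorithms, and applies them to real remote sensing images in comparative experiments. The experiments show that the support-domain-based method OCSVM (one-class SVM) and the density-based method GDD can both classify land cover in remote sensing images, while the BSVM method obtains better classification results because it takes the information of more samples into account. The conclusion is that, in one-class classification problems, adding information from samples of unknown class can improve classification performance. The thesis then introduces the PUL (positive and unlabeled learning) algorithm, which classifies using target-class samples and samples of unknown class. This method has been applied successfully to document classification but has not yet been widely applied to remote sensing image processing. PUL has the advantage of not requiring manually set parameters. Building on PUL, the thesis proposes an improved algorithm, T-PUL. Experiments show that PUL and T-PUL can be applied successfully to Landsat TM remote sensing image classification: with only target-class samples and unlabeled samples, PUL and the proposed T-PUL achieve better classification results than OCSVM and BSVM (Biased SVM), and T-PUL improves on PUL to some extent. The advantage of this type of algorithm is that it preserves classification accuracy while saving the time and effort spent on labeling samples. Given the great potential of PUL and T-PUL for one-class classification of remote sensing images, the thesis studies both methods further, examining experimentally how the number of training samples, the composition of the unlabeled samples, and the choice of target class affect their classification performance. In general, more target-class samples, and unlabeled samples that contain more non-target samples, help achieve higher classification accuracy. However, controlling the sample composition in this way requires more time for labeling and selecting samples, so practical applications must still trade off classification accuracy against the labor and resources spent. In addition, when the sample size is small, the advantage of T-PUL over PUL is more pronounced. PUL and T-PUL therefore have broad application prospects in remote sensing image classification.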
|
Foreign-language abstract: |
In some situations, labeled data are available for only one class. One-class classification means training a classifier on data from a single class and then using it to label the data to be classified. In remote sensing classification, samples of some classes can be hard or even impossible to obtain, whereas traditional classification methods require every land type in an image to be labeled. In some applications, moreover, only one specific class is of interest and other land types need not be considered; labeling samples of every class that occurs in the image then adds to the difficulty of classification and the cost of labeling training data. Such problems can be treated as one-class classification. In this thesis, we introduce and analyze several typical approaches to one-class classification and test them experimentally on TM remote sensing data. The results show that the OCSVM and GDD methods perform relatively well in remote sensing image classification, and that the biased SVM (BSVM) method obtains better results because it combines labeled and unlabeled data when training the classifier. The results also indicate that unlabeled samples provide useful information for constructing classifiers. We then present a positive and unlabeled learning (PUL) algorithm that has good potential for one-class classification. This algorithm needs no labeled negative data in the training set and has shown promise in document classification, but its application to remote sensing classification has not been widely studied. We propose a new algorithm, T-PUL, based on a simple modification of PUL. Both algorithms were applied to classify data extracted from two scenes of TM images. Experimental results indicate that PUL and the proposed T-PUL provide higher accuracy than the OCSVM and BSVM methods, and that T-PUL gives better results than PUL.
The advantage of these algorithms is that they can significantly reduce the cost of labeling training data without losing accuracy. In view of the great potential of PUL and T-PUL in remote sensing classification, the rest of this thesis studies both methods further, using experiments to investigate the influence of the number of training samples, the class composition of the unlabeled samples, and the choice of target class. In general, more target samples, and unlabeled samples that contain more negative samples, result in higher accuracy. However, controlling the class composition of the training sets may increase the cost of labeling training data, so a tradeoff is needed between classification accuracy and labeling cost. In addition, when the number of training samples is small, the difference between PUL and T-PUL is larger, which means the advantage of T-PUL is more obvious.
|
Total number of references: | 54 |
Author biography: | Academic papers: [1] Dameng Yin, Xin Cao, Yijie Shao, Jin Chen, Automatic Thresholding Comparison for Snow Cover Mapping in Landsat TM/ETM+ Imagery, IEEE Geoscience and Remote Sensing Letters, 2011 (in press) [2] 邵一杰, 殷大萌, 陈晋, 曹入尹. 基于TM影像的单类分类算法比较研究, 遥感信息 (submitted) [3] 邵一杰, 殷大萌, 陈晋. |
Call number: | 硕081405/1202 |
Public release date: | 2012-06-05 |