Thesis information

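Chinese title:

 基于贝叶斯神经网络与知识嵌入的图像分类模型

Name:

 王顺钢

Confidentiality level:

 Public

Thesis language:

 chi

Discipline code:

 025200

Discipline:

 Applied Statistics

Student type:

 Master's

Degree:

 Master of Applied Statistics

Degree type:

 Professional degree

Degree year:

 2023

Campus:

 Beijing campus

School:

 School of Statistics

Research area:

 Statistical learning

First advisor name:

 赵俊龙

First advisor affiliation:

 School of Statistics

Submission date:

 2023-06-18

Defense date:

 2023-05-12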

Foreign-language title:

 Image Classification Model Based on Bayesian Neural Network and Knowledge Embedding

Chinese keywords:

 标签嵌入 ; 空间相关卷积核 ; 贝叶斯神经网络

Foreign-language keywords:

 Label embedding ; Spatial correlation convolution kernel ; Bayesian neural network

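Chinese abstract (English translation):

As computer vision advances and its algorithms are deployed in practice, people interact ever more closely with artificial intelligence products, image classification models in particular. Real-world images, however, are often noisy, and this noise introduces uncertainty into a model: a slight perturbation of the input pixels can cause it to output a wrong result, which in turn erodes people's trust in deep learning models. To address the tendency of deep learning models to output overconfident inference results and their extreme sensitivity to noisy data, this thesis studies image classification under noise and proposes a knowledge-embedding confidence classification model.

The knowledge-embedding confidence classification model consists of three parts. The first constructs a spatial correlation convolution kernel operator that extracts the spatial correlation features of an image in a hand-crafted manner. The second uses an advanced label embedding algorithm to map the class labels to a set of vectors in Euclidean space such that the vectors of different labels are related to one another. The third uses Bayesian statistical methods to equip the neural network with a measure of uncertainty: the learnable weights are represented as normally distributed random variables, yielding a Bayesian convolutional neural network, which is optimized with variational inference and the reparameterization trick. In addition, an inference threshold is set in advance to control how strict the model is in producing an output for an input image.

The model is evaluated in turn on the MNIST dataset, random noise images, and letter images. With the inference threshold set to 0.8, the proposed model achieves an accuracy, precision, recall, F1-score, and AUC of 0.9956, 0.9845, 0.9831, 0.9838, and 0.9936, respectively, on the MNIST handwritten digit test set, almost the same predictive accuracy as a classical convolutional neural network. The proposed framework can also decline to make a prediction for noisy and out-of-distribution images: with the inference threshold at 0.9, the model rejects 99.03% of the random noise images, 70.81% of the letter images, and 3.78% of the MNIST handwritten digit images, and the accuracy on successfully inferred samples is 99.75%, which indicates that the proposed model provides a measure of uncertainty.

The proposed model offers a new approach to research on noisy image classification. Future work will try to extend the framework to more complex scenarios so that it can better serve everyday production and life.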

Foreign-language abstract:

With the development of computer vision and the deployment of its algorithms, people are increasingly connected with artificial intelligence products, especially image classification models. However, many real-world images are noisy, and this noise introduces uncertainty into the model: small perturbations of the input pixels may cause it to output wrong results, reducing people's trust in deep learning models. Therefore, to address the problem that deep learning models output overconfident inference results and are extremely sensitive to noisy data, this paper studies image classification with noise and proposes a knowledge-embedding confidence classification model.

The knowledge-embedding confidence classification model comprises three parts. The first constructs a spatial correlation convolution kernel operator to extract the spatial correlation features of the image by hand. The second uses an advanced label embedding algorithm to map the classification labels to a set of vectors in Euclidean space, such that the vectors of different labels are related to one another. The third uses Bayesian statistical methods to give the neural network a measure of uncertainty: the learnable weights of the network are expressed as random variables that follow a normal distribution, which yields a Bayesian convolutional neural network that is optimized with variational inference and the reparameterization trick. In addition, this paper presets an inference threshold to control how strict the model is in producing an output for an input image.
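As a rough illustration of the third part, the sketch below shows the reparameterization trick for a single normally distributed weight, together with the closed-form KL term that variational inference typically adds to the loss. It is not the thesis's implementation: the names `softplus`, `sample_weight`, and `kl_to_standard_normal`, the softplus parameterization of the standard deviation, and the standard normal prior are all assumptions made for this example.

```python
import math
import random


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x)); maps any real rho to sigma > 0."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def sample_weight(mu: float, rho: float, rng: random.Random) -> float:
    """Reparameterized draw w = mu + sigma * eps, with eps ~ N(0, 1).

    The randomness lives in eps only, so w is a deterministic (and
    differentiable) function of the variational parameters mu and rho.
    """
    sigma = softplus(rho)
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps


def kl_to_standard_normal(mu: float, rho: float) -> float:
    """KL( N(mu, sigma^2) || N(0, 1) ) for one weight.

    Assumes a standard normal prior (hypothetical choice for this sketch).
    """
    sigma = softplus(rho)
    return 0.5 * (sigma ** 2 + mu ** 2 - 1.0) - math.log(sigma)


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [sample_weight(0.5, -1.0, rng) for _ in range(5)]
    print(draws)
    print(kl_to_standard_normal(0.5, -1.0))
```

In a full model, each convolutional weight would carry its own `mu` and `rho`, and the training loss would combine the data likelihood with the summed KL terms.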

This paper evaluates the performance of the model using the MNIST dataset, random noise images, and letter images in turn. With an inference threshold of 0.8, the classification accuracy, precision, recall, F1-score, and AUC of the proposed knowledge-embedding confidence classification model on the MNIST handwritten digit test set are 0.9956, 0.9845, 0.9831, 0.9838, and 0.9936, respectively, which shows that the proposed model has almost the same prediction performance as the classical convolutional neural network. In addition, the proposed classification framework has a certain ability to decline to predict on noise images and out-of-distribution images. When the inference threshold is 0.9, the model has a rejection rate of 99.03% for random noise images and 70.81% for letter images. The model has a rejection rate of 3.78% for the MNIST dataset, and the accuracy on successfully inferred samples is 99.75%, which shows that the proposed model can measure uncertainty and also has a certain interpretability and robustness.
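The rejection rate and the accuracy on successfully inferred samples reported above can be computed along the following lines. The abstract does not say which quantity is compared with the inference threshold, so this sketch assumes it is the largest class probability after averaging several stochastic forward passes. The function names and the toy inputs are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def mean_probs(mc_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average class probabilities over Monte Carlo forward passes."""
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    return [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]


def predict_or_reject(
    mc_probs: Sequence[Sequence[float]], threshold: float
) -> Optional[int]:
    """Return the predicted class, or None if confidence is below threshold.

    Assumed rule: accept when the largest averaged probability >= threshold.
    """
    avg = mean_probs(mc_probs)
    best = max(range(len(avg)), key=avg.__getitem__)
    return best if avg[best] >= threshold else None


def summarize(
    decisions: Sequence[Optional[int]], labels: Sequence[int]
) -> Tuple[float, float]:
    """Return (rejection rate, accuracy on the accepted samples)."""
    accepted = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    rejection_rate = 1.0 - len(accepted) / len(decisions)
    if not accepted:
        return rejection_rate, float("nan")
    accuracy = sum(d == y for d, y in accepted) / len(accepted)
    return rejection_rate, accuracy


if __name__ == "__main__":
    confident = [[0.95, 0.05], [0.97, 0.03]]  # two passes that agree
    uncertain = [[0.60, 0.40], [0.40, 0.60]]  # two passes that disagree
    decisions = [
        predict_or_reject(confident, 0.9),
        predict_or_reject(uncertain, 0.9),
    ]
    print(decisions, summarize(decisions, [0, 1]))
```

Raising the threshold trades coverage for reliability: more inputs are rejected, and the accepted ones are those on which the averaged prediction is most confident.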

The model proposed in this paper provides new ideas for the study of noisy image classification methods. In future work, we will try to extend the model framework to more complex scenarios in order to better serve daily production and life.

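Total number of references:

 36

Author biography:

 王顺钢 is a graduate student in Applied Statistics at the School of Statistics, Beijing Normal University. His research area is statistical learning models.

Call number:

 硕025200/23022

Open date:

 2024-06-18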
