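Chinese title: | 面向智能穿戴应用的联邦学习与数据发布隐私保护机制研究 |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 081002 |
Discipline: | |
Student type: | Master's |
Degree: | Master of Engineering |
Degree type: | |
Degree year: | 2022 |
Campus: | |
School: | |
First supervisor name: | |
First supervisor affiliation: | |
Submission date: | 2022-06-15 |
Defense date: | 2022-06-15 |
Foreign-language title: | Research on Privacy Preserving Mechanism of Federated Learning and Data Publishing for Smart Wearable Applications |
Chinese keywords: | |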
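Abstract (translated from Chinese): |
People are paying increasing attention to their health, and smart wearable devices, which record physiological data and health status in real time, are used ever more widely in daily life. In recent years, laws and regulations on data security and privacy protection have been promulgated and put into force one after another, making the data security of smart wearable devices a legal requirement. However, because smart wearable data are distributed, real-time, and of multiple types, existing privacy-preserving techniques are difficult to apply. How to protect privacy in smart wearable scenarios is the problem this thesis sets out to solve.
This thesis focuses on two stages of the smart wearable scenario, federated learning and data publishing, and studies and improves existing privacy-preserving techniques for each. The research is divided into the following parts.
First, for federated learning, an asynchronous federated learning algorithm based on double compensation is proposed. The DC-ASGD algorithm from distributed machine learning, which is based on a second-order Taylor expansion, is used as delay compensation; a FedProx term is added to the objective function as heterogeneity compensation; and updates are performed by implicit gradient descent. The proposed method combines the advantages of both algorithms and yields a loose, flexible mode of federated learning.
Second, for the publishing of static data, a personalized k-anonymity algorithm is proposed. Entropy is used as the measure of data dispersion. Building on the traditional V-MDAV method, the quasi-identifier attributes are weighted to improve security and utility, and categorical and numerical data are handled differently.
Third, for the publishing of real-time data, a Temporal Differential Privacy (TDP) mechanism is proposed. To address the privacy leakage that arises when a data table is updated, TDP is defined on the basis of traditional differential privacy so that the ratio of the probabilities that the data published at two time points fall into the relevant values satisfies a given relation. A TDP implementation based on the Laplace mechanism is also proposed and described formally.
The research above is verified by simulation experiments. In federated learning, the accuracy of the proposed algorithm in both conventional and heterogeneous scenarios is generally about 2% higher than that of the existing FedAsync algorithm. In static data publishing, the personalized k-anonymity algorithm is compared with the traditional V-MDAV algorithm in terms of security and utility. In real-time data publishing, TDP is compared with the traditional differential privacy algorithm in terms of time-series complexity. The results show that the proposed algorithms offer good security and utility.
|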
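The abstract names the two compensation terms but gives no formulas. As a rough illustration only, the sketch below uses the commonly published forms of DC-ASGD delay compensation and the FedProx proximal term; it is not the thesis's algorithm, and it takes a plain explicit gradient step rather than the implicit update the abstract mentions. The function names and the parameters `lam` and `mu` are assumptions introduced for this example.

```python
from typing import List

Vector = List[float]


def delay_compensated_gradient(stale_grad: Vector, stale_weights: Vector,
                               current_weights: Vector, lam: float) -> Vector:
    """DC-ASGD-style correction of a gradient computed on stale weights.

    Element-wise: g + lam * g * g * (w_current - w_stale), where g * g is the
    diagonal approximation of the Hessian used by DC-ASGD.
    """
    return [g + lam * g * g * (w_now - w_old)
            for g, w_old, w_now in zip(stale_grad, stale_weights, current_weights)]


def fedprox_gradient(local_grad: Vector, local_weights: Vector,
                     global_weights: Vector, mu: float) -> Vector:
    """Gradient of the local loss plus the FedProx term (mu/2)*||w - w_global||^2."""
    return [g + mu * (w - w_g)
            for g, w, w_g in zip(local_grad, local_weights, global_weights)]


def sgd_step(weights: Vector, grad: Vector, lr: float) -> Vector:
    """One explicit gradient step (assumed here for simplicity)."""
    return [w - lr * g for w, g in zip(weights, grad)]


if __name__ == "__main__":
    # Client side: pull the local gradient toward the global model (heterogeneity).
    g_local = fedprox_gradient([1.0, -0.5], [2.0, 0.0], [1.0, 0.5], mu=0.1)
    # Server side: correct the late-arriving gradient for staleness (delay).
    g_fixed = delay_compensated_gradient(g_local, [1.0, 0.5], [1.2, 0.4], lam=0.5)
    print(sgd_step([1.2, 0.4], g_fixed, lr=0.1))
```

The two corrections act at different points: the proximal term limits how far a client drifts from the global model, while the delay term adjusts a gradient that was computed against an older copy of the global weights.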
Foreign-language abstract: |
People are paying more and more attention to their health. As tools that record human physiological data and health status in real time, smart wearable devices are widely used in daily life. In recent years, laws and regulations on data security and privacy protection have been promulgated and implemented one after another, and ensuring the data security of smart wearable devices has become a legal requirement. However, because smart wearable data are distributed, real-time, and of multiple types, existing privacy-preserving technologies are difficult to apply. How to protect privacy in smart wearable scenarios is the problem addressed in this thesis.
This thesis focuses on two stages of the smart wearable scenario, federated learning and data publishing, and studies and improves existing privacy-preserving technologies in depth. The research is divided into the following parts.
First, for federated learning, an asynchronous federated learning algorithm based on double compensation is proposed. The DC-ASGD algorithm from distributed machine learning, which is based on a second-order Taylor expansion, is used for delay compensation, while a FedProx term is added to the loss function for heterogeneity compensation, and the update is based on implicit gradient descent. The proposed scheme combines the advantages of the two methods and forms a loose and flexible mode of federated learning.
Second, a personalized k-anonymity algorithm is proposed for the publishing of static data. Entropy is used to measure the dispersion of the data, and, building on the traditional V-MDAV method, the quasi-identifier attributes are weighted to improve security and utility. In addition, categorical data and numerical data are processed differently.
Third, for the publishing of real-time data, the Temporal Differential Privacy (TDP) mechanism is proposed. To address privacy disclosure during data updates, this thesis defines TDP on the basis of traditional differential privacy so that the ratio of the probabilities that the data published at two time points fall into the relevant values satisfies a given relation. A TDP implementation based on the Laplace mechanism is also proposed and formally described.
The research above is verified by simulation experiments. In federated learning, the accuracy of the proposed algorithm in conventional and heterogeneous scenarios is generally about 2% higher than that of the existing FedAsync algorithm. In static data publishing, the security and utility of the personalized k-anonymity algorithm and the traditional V-MDAV algorithm are compared and analyzed. In real-time data publishing, the time-series complexity of TDP and the traditional differential privacy algorithm is compared and analyzed. The results show that the proposed algorithms achieve good security and utility. |
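The abstract states that TDP is implemented on top of the Laplace mechanism but does not give the TDP definition itself. For orientation, the sketch below shows only the standard Laplace mechanism of ordinary differential privacy, not TDP. The function name, the heart-rate example value, and the parameters `sensitivity` and `epsilon` are assumptions made for this illustration.

```python
import random


def laplace_mechanism(true_value: float, sensitivity: float,
                      epsilon: float, rng: random.Random) -> float:
    """Return true_value plus Laplace noise with scale sensitivity / epsilon.

    The difference of two independent exponential samples with mean `scale`
    follows a Laplace(0, scale) distribution.
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    scale = sensitivity / epsilon
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_value + noise


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical reading: a heart rate of 70 bpm released with epsilon = 0.5.
    print(laplace_mechanism(70.0, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger privacy at the cost of accuracy. How noise would be coordinated across successive releases of an updated table is what the thesis's TDP definition specifies, and it is not reproduced here.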
Total number of references: | 56 |
Collection number: | 硕081002/22008 |
Open-access date: | 2023-06-15 |