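Chinese title: | 基于语言模型的实体对齐研究 |
Name: | |
Confidentiality level: | Public |
Thesis language: | chi |
Discipline code: | 081203 |
Discipline/major: | |
Student type: | Master's |
Degree: | Master of Engineering |
Degree type: | |
Degree year: | 2024 |
Campus: | |
School: | |
Research direction: | Knowledge graphs, natural language processing |
Primary supervisor name: | |
Primary supervisor affiliation: | |
Submission date: | 2024-06-20 |
Defense date: | 2024-05-29 |
Foreign-language title: | Research on Entity Alignment Based on Language Models |
Chinese keywords: | |
Foreign-language keywords: | Entity alignment ; Knowledge graph ; Large language models ; Natural language processing |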
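Chinese abstract: |
Entity alignment is a key task in knowledge fusion: it identifies entities across multi-source heterogeneous knowledge graphs and aligns and merges their information. Methods based on TransE, graph neural networks, and language models usually assume that the graphs share similar topology and textual information; they identify corresponding entities by learning embeddings of each graph's structure and text, and depend on the consistency of structural and textual information to achieve alignment. When the graphs differ markedly in topology and text, this assumption often no longer holds, and model performance drops sharply. Large language models such as GPT-3 and LLaMA have strong semantic understanding and reasoning abilities and offer a distinctive way to handle complex heterogeneity in entity alignment. Drawing on these strengths, this thesis proposes a series of methods to address these problems. |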
Foreign-language abstract: |
Entity alignment is a critical task in the field of knowledge fusion, aimed at identifying and aligning entity information across multi-source heterogeneous knowledge graphs. Methods based on TransE, graph neural networks, and language models typically assume similar topological structures and textual information across graphs. By learning embeddings of structures and texts within these graphs, they align corresponding entities, relying on the consistency of structural and textual information. However, this assumption often fails for graphs with significantly heterogeneous topologies and textual information, leading to substantial performance degradation. Large language models such as GPT-3 and LLaMA, with their robust semantic understanding and reasoning capabilities, offer unique solutions to complex heterogeneity issues in entity alignment. This thesis leverages the advantages of large language models to propose a series of methods addressing these challenges. |
Total references: | 96 |
Call number: | 硕081203/24004 |
Open-access date: | 2025-06-20 |