Chinese title: | 句子自动生成及受控模型研究 (Research on Models of Automatic Sentence Generation and Controlled Generation) |
Name: | |
Confidentiality level: | Public |
Thesis language: | Chinese |
Discipline code: | 081203 |
Discipline: | |
Student type: | Doctoral |
Degree: | Doctor of Engineering |
Degree type: | |
Degree year: | 2020 |
Campus: | |
School: | |
Research direction: | Natural language processing |
First supervisor: | |
First supervisor's affiliation: | |
Submission date: | 2020-06-19 |
Defense date: | 2020-06-05 |
Foreign title: | RESEARCH ON MODELS OF AUTOMATIC SENTENCE GENERATION AND CONTROLLED GENERATION |
Chinese keywords: | |
Foreign keywords: | Natural language generation ; Automatic sentence generation ; Controlled automatic sentence generation ; Recurrent neural network ; Variational autoencoder ; Convolutional neural network |
Chinese abstract: |
Natural language generation is a core task and active research topic in natural language processing and artificial intelligence, and a frontier technology with broad application prospects. Deep learning based on artificial neural networks is the key behind the latest wave of explosive progress in artificial intelligence and has rapidly spread into more and more AI tasks. In natural language generation research, deep learning methods based on artificial neural networks have likewise attracted wide attention and driven the field forward.
Automatic sentence generation is a key step in many natural language generation tasks, and controlled automatic sentence generation plays an important role in many of them. This dissertation takes the automatic generation of sentences as its research problem and neural-network-based sentence generation models as its research object; in the course of this research, it proposes new models for automatic sentence generation and for controlled automatic sentence generation. Specifically, the main innovations and contributions of this dissertation are:
(1) This dissertation proposes Coupled-RNN (Coupled Recurrent Neural Network), a word-controlled automatic sentence generation model based on recurrent neural networks. The recurrent neural network language model is the most commonly used basic model in natural language generation tasks; however, it can only control one or a few words at the start of a generated sentence, cannot impose constraints on words at other positions, and certainly cannot control the exact position at which a word appears in the sentence. Building on the recurrent neural network language model, Coupled-RNN uses two recurrent neural networks to generate the sentence backward and forward from the constrained word. To address the loss of position information, Coupled-RNN introduces a position-encoding mechanism; to address the problem that errors could not be back-propagated, it introduces a hidden-state coupling mechanism and a weighted output coupling mechanism. Experiments show that Coupled-RNN can generate sentences containing the constrained word at the specified position, and that the proposed position-encoding mechanism and the two coupling mechanisms are effective.
(2) This dissertation proposes LSE-VAE (Latent Space Expanded Variational AutoEncoder), an automatic sentence generation model based on the variational autoencoder. Because the recurrent neural network language model adopted in Coupled-RNN lacks a global representation of the sentence, the quality and diversity of the generated sentences suffer. Both the autoencoder and the variational autoencoder introduce a global sentence representation vector; however, the autoencoder encodes each input sentence as a deterministic, isolated latent vector and is therefore unsuitable for automatic sentence generation, while the variational autoencoder has a continuous latent space but suffers from weak sentence reconstruction, a limited effective latent space, poor sentence diversity, and difficult training. Building on the variational autoencoder, LSE-VAE assigns a different prior distribution over latent vectors to each sentence and arranges the latent space according to sentence similarity, thereby learning a larger and more informative latent space. Experiments show that LSE-VAE reconstructs sentences better and generates sentences of higher quality and diversity; that it learns a latent space as continuous and smooth as that of the variational autoencoder while better distinguishing sentences of different similarity; and that it is easier to train, requiring neither KL cost annealing nor other engineering tricks, with hyperparameters that can be derived analytically from the latent vector dimensionality and the modeling requirements.
(3) This dissertation proposes TW_LSE-VAE (Topic and Word constrained LSE-VAE), a word- and topic-controlled automatic sentence generation model based on LSE-VAE. TW_LSE-VAE builds its language model on LSE-VAE and modifies the decoder structure following Coupled-RNN. Meanwhile, to control the topic of the generated sentences, a topic model built with a convolutional neural network models the topic information of the sentences, and the topic vectors output by the topic model are used for the sentence-similarity measure in LSE-VAE and for decoding the sentences. Through joint training of the language model and the topic model, the generated sentences contain the constrained words and reflect the specified topic distribution information. Experiments show that TW_LSE-VAE can generate high-quality, diverse sentences; that the topics learned by the topic model are more coherent than those of the baseline model; and that the generated sentences not only contain the constrained words at the specified positions but also reflect one or more topics specified by the topic distribution vector.
|
Foreign abstract: |
Natural language generation is a core task and an active research topic in natural language processing and artificial intelligence. Deep learning based on artificial neural networks is the key behind the latest wave of explosive progress in artificial intelligence. In natural language generation research, deep learning methods based on artificial neural networks have attracted wide attention and driven the field forward.
Automatic sentence generation is a key step in many natural language generation tasks, and controlled automatic sentence generation plays an important role in many of them. This dissertation takes the automatic generation of sentences as its research problem and neural-network-based sentence generation models as its research object; in the course of this research, it proposes new models for automatic sentence generation and for controlled automatic sentence generation. Specifically, the main innovations and contributions of this dissertation are:
(1) This dissertation proposes Coupled-RNN (Coupled Recurrent Neural Network), a word-controlled automatic sentence generation model based on recurrent neural networks. The recurrent neural network language model is the most commonly used basic model in natural language generation tasks. However, it can only control one or a few words at the beginning of a generated sentence; it cannot constrain words at other positions in the sentence, let alone control the exact position at which a word appears. Building on the recurrent neural network language model, Coupled-RNN uses two recurrent neural networks to generate the sentence backward and forward starting from the constrained word (see the first sketch after this abstract). To solve the problem of missing position information, Coupled-RNN introduces a position-encoding mechanism; to solve the problem that gradients cannot be back-propagated, it introduces a hidden-state coupling mechanism and a weighted output coupling mechanism. Experiments show that Coupled-RNN can generate sentences containing the constrained word at the specified position, and that the proposed position-encoding mechanism and the two coupling mechanisms are effective.
(2) This dissertation proposes LSE-VAE (Latent Space Expanded Variational AutoEncoder), an automatic sentence generation model based on the variational autoencoder. Because the recurrent neural network language model used in Coupled-RNN lacks a global representation of the sentence, the quality and diversity of the generated sentences suffer. Both the autoencoder and the variational autoencoder provide a global sentence representation vector. However, the autoencoder encodes each input sentence as a deterministic, isolated latent vector, which makes it unsuitable for automatic sentence generation, while the variational autoencoder has a continuous latent space but suffers from weak sentence reconstruction, a limited effective latent space, poor sentence diversity, and difficult training. Building on the variational autoencoder, LSE-VAE learns a larger and more informative latent space by setting a different prior distribution over latent vectors for each sentence and by arranging the latent space according to sentence similarity (see the second sketch after this abstract). Experiments show that LSE-VAE reconstructs sentences better and generates sentences of higher quality and diversity; that its latent space is as continuous and smooth as that of the variational autoencoder while better distinguishing sentences of different similarity; and that it is easier to train, since it needs neither KL cost annealing nor other engineering tricks, and its hyperparameters can be derived analytically from the latent vector size and the modeling requirements.
(3) This dissertation proposes TW_LSE-VAE (Topic and Word constrained LSE-VAE), a word- and topic-controlled automatic sentence generation model based on LSE-VAE. TW_LSE-VAE builds its language model on LSE-VAE and modifies the decoder structure following Coupled-RNN. Meanwhile, to control the topic of the generated sentences, a topic model built with a convolutional neural network captures the topic information of the sentences (see the third sketch after this abstract); the topic vectors output by the topic model are then used both to measure sentence similarity in LSE-VAE and to decode the sentences. Through joint training of the language model and the topic model, the generated sentences contain the constrained words and reflect the specified topic distribution information. Experiments show that TW_LSE-VAE generates high-quality, diverse sentences; that the topics learned by its topic model are more coherent than those of the baseline model; and that the generated sentences not only contain the constrained words at the specified positions but also reflect one or more topics specified by the topic distribution vectors.
|
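First, a minimal sketch of the Coupled-RNN idea from contribution (1), assuming a PyTorch implementation: two GRUs grow the sentence outward from the constrained word, position embeddings restore word-position information, and a coupling layer mixes the two hidden states. The layer sizes, the tanh coupling, and the greedy decoding loop are illustrative assumptions, and the weighted output coupling is omitted for brevity.

```python
import torch
import torch.nn as nn

class CoupledRNN(nn.Module):
    """Sketch: two GRUs grow a sentence outward from a constrained word,
    one toward the sentence start and one toward the end. Position
    embeddings restore word-position information; a coupling layer mixes
    the two hidden states so information and gradients can flow between
    the directions. All sizes are illustrative assumptions."""

    def __init__(self, vocab_size, emb_dim=128, hid_dim=256, max_len=40):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, emb_dim)
        self.pos_emb = nn.Embedding(max_len, emb_dim)   # position encoding
        self.bwd = nn.GRUCell(emb_dim, hid_dim)         # word -> sentence start
        self.fwd = nn.GRUCell(emb_dim, hid_dim)         # word -> sentence end
        self.couple = nn.Linear(2 * hid_dim, hid_dim)   # hidden-state coupling
        self.out = nn.Linear(hid_dim, vocab_size)
        self.hid_dim = hid_dim

    def _embed(self, tok, pos):
        # token embedding + position embedding for a batch of size 1
        return self.tok_emb(tok) + self.pos_emb(torch.tensor([pos]))

    @torch.no_grad()
    def generate(self, word_id, word_pos, n_left, n_right):
        """Greedy decoding outward from the constrained word.
        Assumes word_pos >= n_left and word_pos + n_right < max_len."""
        tok = torch.tensor([word_id])
        h0 = torch.zeros(1, self.hid_dim)
        x = self._embed(tok, word_pos)
        h_b, h_f = self.bwd(x, h0), self.fwd(x, h0)
        # couple the two hidden states at the constrained word
        h_b = h_f = torch.tanh(self.couple(torch.cat([h_b, h_f], dim=-1)))
        left, right = [], []
        for i in range(n_left):                         # backward direction
            tok = self.out(h_b).argmax(dim=-1)
            left.append(tok.item())
            h_b = self.bwd(self._embed(tok, word_pos - 1 - i), h_b)
        for i in range(n_right):                        # forward direction
            tok = self.out(h_f).argmax(dim=-1)
            right.append(tok.item())
            h_f = self.fwd(self._embed(tok, word_pos + 1 + i), h_f)
        return list(reversed(left)) + [word_id] + right
```

For example, `CoupledRNN(vocab_size=10000).generate(word_id=42, word_pos=4, n_left=4, n_right=8)` places token 42 at position 4 of a 13-token output (an untrained model would, of course, emit arbitrary surrounding tokens).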
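Second, a minimal sketch of the two ingredients of LSE-VAE in contribution (2) that differ from a vanilla variational autoencoder: a per-sentence prior placed according to a similarity-bearing sentence representation, and the closed-form KL term against that prior. The fixed-radius placement rule and both function names are hypothetical; only the diagonal-Gaussian KL formula is standard.

```python
import torch
import torch.nn.functional as F

def sentence_prior_mean(sent_vec, radius=3.0):
    """Hypothetical placement rule: put each sentence's prior mean at a
    fixed radius in the direction of its (detached) sentence
    representation, so similar sentences receive nearby priors and the
    usable latent space extends beyond the single N(0, I) ball of a
    vanilla VAE."""
    return radius * F.normalize(sent_vec.detach(), dim=-1)

def kl_to_sentence_prior(mu, logvar, prior_mu, prior_logvar=0.0):
    """Closed-form KL( N(mu, exp(logvar)) || N(prior_mu, exp(prior_logvar)) )
    for diagonal Gaussians, summed over latent dimensions. With
    prior_mu = 0 and prior_logvar = 0 this reduces to the usual VAE KL."""
    prior_logvar = torch.as_tensor(prior_logvar)
    var_ratio = (logvar - prior_logvar).exp()
    delta_sq = (mu - prior_mu).pow(2) / prior_logvar.exp()
    return 0.5 * (var_ratio + delta_sq - 1.0 - (logvar - prior_logvar)).sum(dim=-1)
```

The training objective is then the reconstruction loss plus this KL term; because the priors of different sentences are spread apart by construction, this is the mechanism by which KL cost annealing is dispensed with, and in this sketch the radius stands in for the hyperparameter the abstract says can be derived analytically.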
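Third, a minimal sketch of a convolutional topic encoder in the spirit of contribution (3): 1-D convolutions of several widths slide over the word embeddings, global max-pooling summarizes each filter, and a softmax projection yields a distribution over topics. Filter widths, filter counts, and the number of topics are illustrative assumptions.

```python
import torch
import torch.nn as nn

class CNNTopicEncoder(nn.Module):
    """Sketch of a CNN topic model: convolutions + global max-pooling over
    word embeddings, projected to a topic distribution. Sizes are
    illustrative assumptions, not the thesis's configuration."""

    def __init__(self, vocab_size, emb_dim=128, n_topics=50,
                 widths=(3, 4, 5), n_filters=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            nn.Conv1d(emb_dim, n_filters, w) for w in widths)
        self.to_topics = nn.Linear(n_filters * len(widths), n_topics)

    def forward(self, tokens):
        # tokens: (batch, seq_len); seq_len must be >= max(widths)
        x = self.emb(tokens).transpose(1, 2)      # (batch, emb_dim, seq_len)
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        logits = self.to_topics(torch.cat(pooled, dim=1))
        return torch.softmax(logits, dim=-1)      # per-sentence topic mixture
```

In a TW_LSE-VAE-style setup, this topic vector would serve as the sentence-similarity signal that arranges the LSE-VAE priors and would also condition the decoder, with the language model and the topic model trained jointly as the abstract describes.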
Total references: | 169 |
Call number: | 博081203/20002 |
Open access date: | 2021-06-19 |