Improving Text Generation with Student-Forcing Optimal Transport

Conference on Empirical Methods in Natural Language Processing (2020)

Abstract
Neural language models are often trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. During testing, however, the model is instead conditioned on previously generated tokens, resulting in what is termed exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. We examine the necessity of adding a Student-Forcing scheme during training through an imitation learning interpretation. We further propose an extension that improves OT learning for long sequences, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
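The abstract gives no implementation details, but the core idea, matching teacher-forced and student-forced decoder outputs under an optimal-transport cost, can be sketched roughly as follows. The snippet below computes an entropy-regularized OT (Sinkhorn) distance between two token-embedding sequences; the function name, the cosine cost, the uniform marginals, and the hyperparameters eps and n_iters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def sinkhorn_ot_loss(x, y, eps=0.1, n_iters=20):
    """Entropy-regularized OT (Sinkhorn) distance between two
    embedding sequences x: (m, d) and y: (n, d).

    Assumption: cosine distance as the token-level cost and
    uniform marginals over sequence positions.
    """
    # Pairwise cosine-distance cost matrix C: (m, n)
    x_n = F.normalize(x, dim=-1)
    y_n = F.normalize(y, dim=-1)
    C = 1.0 - x_n @ y_n.t()

    m, n = C.shape
    a = torch.full((m,), 1.0 / m, device=C.device)  # source marginal
    b = torch.full((n,), 1.0 / n, device=C.device)  # target marginal

    K = torch.exp(-C / eps)        # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):       # Sinkhorn fixed-point updates
        v = b / (K.t() @ u)
        u = a / (K @ v)

    P = u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan diag(u) K diag(v)
    return (P * C).sum()                     # <P, C>, the OT cost
```

In training, such a loss would be added to the usual MLE objective, with x taken from embeddings of the teacher-forced sequence and y from embeddings of the sequence decoded by feeding back the model's own samples (student forcing), so that gradients pull the two generation modes toward each other.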
Keywords
text generation, transport, student-forcing