TESS: Text-to-Text Self-Conditioned Simplex Diffusion
arXiv (Cornell University), 2023
Abstract
Diffusion models have emerged as a powerful paradigm for generation,
obtaining strong performance in various continuous domains. However, applying
continuous diffusion models to natural language remains challenging due to its
discrete nature and the need for a large number of diffusion steps to generate
text, making diffusion-based generation expensive. In this work, we propose
Text-to-text Self-conditioned Simplex Diffusion (TESS), a text diffusion model
that is fully non-autoregressive, employs a new form of self-conditioning, and
applies the diffusion process on the logit simplex space rather than the
learned embedding space. Through extensive experiments on natural language
understanding and generation tasks including summarization, text
simplification, paraphrase generation, and question generation, we demonstrate
that TESS outperforms state-of-the-art non-autoregressive models, requires
fewer diffusion steps with minimal drop in performance, and is competitive with
pretrained autoregressive sequence-to-sequence models. We publicly release our
codebase at https://github.com/allenai/tess-diffusion.
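To make the "logit simplex space" concrete: rather than diffusing over learned token embeddings, this family of models represents each token as an almost-one-hot logit vector over the vocabulary and runs the Gaussian noising process on those logits. The sketch below is a minimal illustration of that idea, not the paper's implementation; the scale `k`, the linear noise schedule, and both function names are assumptions for exposition.

```python
import numpy as np

def tokens_to_simplex_logits(token_ids, vocab_size, k=5.0):
    """Map token ids to almost-one-hot logit vectors:
    +k at the token's index, -k elsewhere (k is a hypothetical scale)."""
    logits = np.full((len(token_ids), vocab_size), -k)
    logits[np.arange(len(token_ids)), token_ids] = k
    return logits

def add_noise(logits, t, num_steps, rng=np.random.default_rng(0)):
    """Forward-diffusion sketch: interpolate the clean logits toward
    Gaussian noise under an assumed linear schedule."""
    alpha = 1.0 - t / num_steps  # hypothetical schedule, not the paper's
    noise = rng.standard_normal(logits.shape)
    return np.sqrt(alpha) * logits + np.sqrt(1.0 - alpha) * noise

# A softmax over the (noisy) logits yields a point on the probability
# simplex, which is what the denoising model conditions on.
clean = tokens_to_simplex_logits([2, 0, 1], vocab_size=4)
noisy = add_noise(clean, t=100, num_steps=1000)
probs = np.exp(noisy) / np.exp(noisy).sum(axis=-1, keepdims=True)
```

At low noise levels the argmax of the noisy logits still recovers the original tokens, which is why decoding can tolerate a reduced number of diffusion steps.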
Keywords
diffusion, text-to-text, self-conditioned