A generalized decoding method for neural text generation.

Comput. Speech Lang. (2023)

Abstract
In natural language generation, most decoding methods are not intrinsic because their performance depends on extrinsically configured hyperparameters. This means that, first, the generation system is dynamic under different conditions while the decoding system remains static under any condition once its hyperparameters are extrinsically fixed; and second, it is hard to select a constant decoding hyperparameter that works well under all conditions. Although hyperparameter-free decoding methods exist, such as greedy decoding and plain sampling, it is well established that they generally perform worse than methods with hyperparameters, such as beam search, top-k, and top-p. Decoding methods with hyperparameters yield infinitely many strategies from their different fixed configurations, whereas hyperparameter-free methods have only one strategy, so the comparison between them is actually unfair: it is a one-versus-infinite battle. How, then, should decoding hyperparameters be handled properly and intrinsically? Is it true that hyperparameter-free methods are always inferior to methods with inexhaustible hyperparameter configurations? Is it possible to design a generalized framework by which these decoding methods can be naturally connected, uniformly described, and mutually inspired? In this paper, we try to find answers to these questions.
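To make the contrast concrete, the following minimal sketch (not the paper's proposed method; the toy distribution and function names are illustrative assumptions) shows how greedy decoding and plain sampling take no hyperparameters, while top-k and top-p truncation depend on an extrinsically fixed k or p that changes their behavior.

```python
# Illustrative sketch only: contrasts hyperparameter-free decoding
# (greedy, plain sampling) with hyperparameter-dependent truncation
# (top-k, top-p) over a single next-token distribution.
import numpy as np

def greedy(probs):
    """Hyperparameter-free: always pick the most probable token."""
    return int(np.argmax(probs))

def plain_sample(probs, rng):
    """Hyperparameter-free: sample from the full distribution."""
    return int(rng.choice(len(probs), p=probs))

def top_k_sample(probs, k, rng):
    """Keep the k most probable tokens, renormalize, then sample.
    The cutoff k is an extrinsically fixed hyperparameter."""
    top = np.argsort(probs)[::-1][:k]
    p = probs[top] / probs[top].sum()
    return int(top[rng.choice(len(top), p=p)])

def top_p_sample(probs, p_threshold, rng):
    """Keep the smallest prefix of tokens (by probability) whose cumulative
    mass reaches p_threshold, then sample; p_threshold is a hyperparameter."""
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = int(np.searchsorted(cum, p_threshold)) + 1
    kept = order[:cutoff]
    p = probs[kept] / probs[kept].sum()
    return int(kept[rng.choice(len(kept), p=p)])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    probs = np.array([0.5, 0.2, 0.15, 0.1, 0.05])  # toy next-token distribution
    print(greedy(probs))                                    # no hyperparameter
    print(plain_sample(probs, rng))                         # no hyperparameter
    print(top_k_sample(probs, k=3, rng=rng))                # behavior depends on k
    print(top_p_sample(probs, p_threshold=0.9, rng=rng))    # behavior depends on p
```

Each choice of k or p defines a distinct decoding strategy, which is the "infinite" side of the one-versus-infinite comparison described in the abstract.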
Keywords
neural text generation, generalized