Response Timing Estimation for Spoken Dialog Systems Based on Syntactic Completeness Prediction

2022 IEEE Spoken Language Technology Workshop (SLT) (2023)

Abstract
Appropriate response timing is important for smooth dialog progression. Conventionally, prosodic, temporal, and linguistic features have been used to determine this timing. In addition to these conventional features, we propose using the syntactic completeness expected after a certain time, which indicates whether the other party is about to finish speaking. We generate the next token sequence from intermediate speech recognition results using a language model and obtain the probability that the end of the utterance appears $K$ tokens ahead, where $K$ varies from 1 to $M$. This yields an $M$-dimensional vector, which we call the estimates of syntactic completeness (ESC). We evaluated this method on a simulated dialog database of a restaurant information center. The results confirmed that considering ESC improves response timing estimation, especially the accuracy of quick responses, compared with a method that uses only conventional features.
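The ESC computation described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the language model or the decoding procedure, so everything below is an assumption made for illustration: a hypothetical toy bigram table (`BIGRAM`) stands in for the language model, the end-of-utterance token is written `</s>`, and the function `esc_vector` interprets "end of utterance appearing $K$ tokens ahead" as the probability that the first end-of-utterance token is exactly the $K$-th generated token. It is not the authors' implementation.

```python
from typing import Dict, List

EOS = "</s>"  # assumed end-of-utterance symbol

# Hypothetical toy bigram language model: P(next token | current token).
# A real system would query a trained language model instead.
BIGRAM: Dict[str, Dict[str, float]] = {
    "i": {"want": 1.0},
    "want": {"sushi": 0.5, "a": 0.5},
    "a": {"table": 1.0},
    "sushi": {EOS: 1.0},
    "table": {EOS: 0.5, "please": 0.5},
    "please": {EOS: 1.0},
}


def esc_vector(partial: List[str], m: int) -> List[float]:
    """Return an m-dimensional vector whose k-th entry (k = 1..m) is the
    probability that the first EOS is exactly k tokens ahead of the
    non-empty partial recognition result `partial`."""
    # Probability mass over the last token of every still-unfinished continuation.
    state: Dict[str, float] = {partial[-1]: 1.0}
    esc: List[float] = []
    for _ in range(m):
        next_state: Dict[str, float] = {}
        p_eos = 0.0
        for tok, p in state.items():
            for nxt, q in BIGRAM.get(tok, {}).items():
                if nxt == EOS:
                    p_eos += p * q  # utterance ends at this step
                else:
                    next_state[nxt] = next_state.get(nxt, 0.0) + p * q
        esc.append(p_eos)
        state = next_state
    return esc


if __name__ == "__main__":
    print(esc_vector(["i", "want"], 4))           # [0.0, 0.5, 0.25, 0.25]
    print(esc_vector(["i", "want", "sushi"], 2))  # [1.0, 0.0]
```

In this toy setting, a partial result ending in "sushi" puts all the mass on the first entry, which is the kind of signal that could support a quick response, whereas a result ending in "want" spreads the mass over later entries. With a neural language model, the exact marginalization used here would typically be replaced by sampled or beam-searched continuations.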
Keywords
Spoken Dialog System,Turn-taking,Response Timing