Parallelizing WFST Speech Decoders

Charith Mendis, Jasha Droppo, Saeed Maleki, Madanlal Musuvathi, Todd Mytkowicz, Geoffrey Zweig

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

Abstract
The performance-intensive part of a large-vocabulary continuous speech-recognition system is the Viterbi computation, which determines the sequence of words most likely to have generated the acoustic-state scores extracted from an input utterance. This paper presents an efficient parallel algorithm for Viterbi. The key idea is to partition the per-frame computation among threads so as to minimize inter-thread communication, despite traversing large, irregular acoustic and language-model graphs. Together with a per-thread beam search, load balancing of language-model lookups, and memory optimizations, we achieve a 6.67x speedup over a highly optimized, production-quality WFST-based speech decoder. On a 200,000-word vocabulary and a 59-million n-gram model, our decoder runs at 0.27x real time while achieving a word-error rate of 14.81% on 6214 labeled utterances from Voice Search data.
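To make the terms in the abstract concrete, the following is a minimal sketch of a frame-synchronous Viterbi pass with beam pruning, in which destination states are assigned to shards that stand in for threads. It is an illustration under stated assumptions, not the paper's algorithm: the function name `viterbi_beam`, the graph encoding, the `dst % num_shards` ownership rule, and the per-shard pruning are all hypothetical choices, and the loop runs sequentially rather than on real threads.

```python
import math


def viterbi_beam(arcs, frame_scores, start, beam, num_shards=2):
    """Frame-synchronous Viterbi with per-shard beam pruning (illustrative only).

    arcs:         dict mapping src state -> list of (dst state, label, arc cost)
    frame_scores: list (one entry per frame) of dict label -> acoustic cost
    start:        integer start state
    beam:         keep hypotheses within `beam` of the best cost in their shard
    num_shards:   number of shards; each shard stands in for one thread

    Costs are negative log probabilities, so lower is better.
    Returns (best cost, best state path), or (inf, []) if nothing survives.
    """
    # Active hypotheses: state -> (accumulated cost, state path so far).
    active = {start: (0.0, [start])}

    for scores in frame_scores:
        # Assumption: a destination state is owned by shard `dst % num_shards`,
        # so each shard only ever updates its own states.
        shards = [dict() for _ in range(num_shards)]
        for src, (cost, path) in active.items():
            for dst, label, arc_cost in arcs.get(src, []):
                if label not in scores:
                    continue
                new_cost = cost + arc_cost + scores[label]
                owned = shards[dst % num_shards]
                # Viterbi recursion: keep only the cheapest path into dst.
                if dst not in owned or new_cost < owned[dst][0]:
                    owned[dst] = (new_cost, path + [dst])

        # Per-shard beam: prune relative to the best hypothesis in that shard.
        active = {}
        for owned in shards:
            if not owned:
                continue
            best = min(cost for cost, _ in owned.values())
            for state, (cost, path) in owned.items():
                if cost <= best + beam:
                    active[state] = (cost, path)

    if not active:
        return math.inf, []
    return min(active.values(), key=lambda hyp: hyp[0])


# Toy graph (hypothetical): two competing paths from state 0 to state 3.
toy_arcs = {
    0: [(1, "a", 0.0), (2, "b", 0.0)],
    1: [(3, "c", 0.0)],
    2: [(3, "d", 0.0)],
}
toy_frames = [{"a": 1.0, "b": 2.0}, {"c": 5.0, "d": 0.5}]

wide = viterbi_beam(toy_arcs, toy_frames, start=0, beam=10.0)
narrow = viterbi_beam(toy_arcs, toy_frames, start=0, beam=0.5, num_shards=1)
```

With the wide beam the search keeps both first-frame hypotheses and finds the cheaper path through state 2; with the narrow beam and a single shard, state 2 is pruned after the first frame and the search settles for the costlier path through state 1. This is the usual trade-off of beam search: less work per frame in exchange for a possibly suboptimal result.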
Keywords
Parallel Viterbi, WFST Decoder, Large vocabulary