Towards High Performance Lvcsr In Speech-To-Speech Translation System On Smart Phones

12TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2011 (INTERSPEECH 2011), VOLS 1-5(2011)

引用 25|浏览63
暂无评分
摘要
This paper presents the endeavors to improve the performance of large vocabulary continuous speech recognition (LVCSR) in speech-to-speech translation system on smart phones. A variety of techniques towards high LVCSR performance are investigated to achieve high accuracy and low latency given constrained resources. This includes one-pass streaming mode decoding for minimum latency, acoustic modeling with full-covariance based on bootstrap and model restructuring for improving recognition accuracy with limited training data; quantized discriminative feature space transforms and quantized Gaussian mixture model to reduce memory usage with negligible degradation on recognition accuracy. Some speed optimization methods are also discussed to increase the recognition speed. The proposed techniques evaluated on the DARPA Transtac datasets will be shown to give good overall performance under the constraints of both CPU and memory on smart phones.
更多
查看译文
关键词
speech recognition, speech-to-speech translation, one-pass streaming, bootstrap, quantized discriminative feature space transforms
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要