Low-Latency Lightweight Streaming Speech Recognition With 8-Bit Quantized Simple Gated Convolutional Neural Networks

2020 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING (2020)

Abstract
Automatic speech recognition (ASR) is very important for mobile devices. However, deep neural network-based ASR demands a large number of computations, while the memory bandwidth and battery capacity of mobile devices are limited. Server-based implementations are mostly employed, but they increase latency and raise privacy concerns. Efficient on-device ASR addresses these issues. In this paper, we propose a low-latency on-device speech recognition system with a simple gated convolutional network (SGCN). The SGCN shows competitive recognition accuracy even with 1M parameters. In addition, the SGCN lends itself to parallelization, which enables efficient cache utilization. 8-bit quantization is applied to reduce the memory size and computation time. The proposed system performs online recognition within the 0.4 s latency limit and operates with a real-time factor of 0.2 using only a single 900 MHz CPU core. The system has a 1.2 MB memory footprint and achieves a 19.75% word error rate (WER) with greedy decoding.
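To make the figures in the abstract concrete, the following is a minimal sketch of symmetric per-tensor 8-bit quantization and of the real-time factor calculation. The abstract does not describe the quantization scheme actually used, so the symmetric scheme here is an assumption, and the names `quantize_int8`, `dequantize`, and `real_time_factor` are hypothetical helpers chosen for illustration.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization (assumed scheme, not taken from the paper).

    Maps each float weight to an integer in [-127, 127] using one shared scale.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the 8-bit integers."""
    return [v * scale for v in quantized]


def real_time_factor(processing_seconds, audio_seconds):
    """Real-time factor: processing time divided by audio duration."""
    return processing_seconds / audio_seconds


# Toy weights, purely illustrative.
weights = [0.4, -1.0, 0.2, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Storage for 1M parameters: 1 byte each at 8-bit versus 4 bytes each at float32.
num_params = 1_000_000
int8_megabytes = num_params * 1 / 1e6
float32_megabytes = num_params * 4 / 1e6

# A real-time factor of 0.2 means 10 s of audio is processed in 2 s.
rtf = real_time_factor(2.0, 10.0)
```

Under these assumptions, 1M parameters take about 1 MB at 8 bits instead of about 4 MB at 32-bit floating point, which is in line with the reported 1.2 MB footprint. The rounding error of each restored weight is bounded by half the scale.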
Keywords
On-device speech recognition, Convolutional neural networks