Quantization Aware Training with Absolute-Cosine Regularization for Automatic Speech Recognition.

INTERSPEECH (2020)

Abstract
Compression and quantization are important for neural networks in general and Automatic Speech Recognition (ASR) systems in particular, especially when they operate in real time on resource-constrained devices. Using fewer bits for the model weights makes the model much smaller and significantly reduces inference time, at the cost of degraded performance. Such degradation can potentially be addressed by so-called quantization-aware training (QAT). Existing QAT schemes mostly account for quantization in forward propagation, while ignoring the quantization loss in the gradient calculation during back-propagation. In this work, we introduce a novel QAT scheme based on absolute-cosine regularization (ACosR), which enforces a prior, quantization-friendly distribution on the model weights. We apply this approach to the ASR task, assuming a recurrent neural network transducer (RNN-T) architecture. The results show zero to little degradation between floating-point, 8-bit, and 6-bit ACosR models. Weight distributions further confirm that in-training weights lie very close to the quantization levels when ACosR is applied.
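For readers who want the general shape of such a regularizer, the sketch below shows one way a quantization-grid penalty could be added to a task loss in PyTorch. The abstract does not give the exact functional form, step-size rule, or weighting used in the paper, so the penalty 1 - |cos(pi * w / delta)|, the per-tensor step size delta, and the names acos_regularizer, lam, and num_bits are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

def acos_regularizer(weights: torch.Tensor, delta: torch.Tensor) -> torch.Tensor:
    # Assumed form: zero when a weight sits exactly on the quantization grid
    # (multiples of delta) and largest halfway between grid points.
    return (1.0 - torch.abs(torch.cos(torch.pi * weights / delta))).sum()

def total_loss(model: nn.Module, task_loss: torch.Tensor,
               num_bits: int = 8, lam: float = 1e-4) -> torch.Tensor:
    # Add the regularizer over weight tensors to the ordinary training loss.
    reg = torch.zeros((), device=task_loss.device)
    for p in model.parameters():
        if p.requires_grad and p.dim() > 1:  # skip biases and norm scales
            # Per-tensor step size of a symmetric uniform num_bits quantizer
            # (an assumption; the paper may choose delta differently).
            delta = (p.detach().abs().max()
                     / (2 ** (num_bits - 1) - 1)).clamp_min(1e-8)
            reg = reg + acos_regularizer(p, delta)
    return task_loss + lam * reg
```

In this sketch the penalty is differentiable almost everywhere, so it simply joins the task loss during standard back-propagation and nudges weights toward the quantization levels as training proceeds.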
Keywords
speech recognition, quantization-aware training (QAT), recurrent neural network transducer (RNN-T), regularization, absolute-cosine regularization