Multi-Scale Dynamic Fixed-Point Quantization and Training for Deep Neural Networks

Po-Yuan Chen, Hung-Che Lin, Jiun-In Guo

ISCAS (2023)

Abstract
State-of-the-art deep neural networks often require extremely high computational power, making their deployment on embedded devices impractical. Model quantization is therefore important for deploying deep neural networks on edge devices. The purpose of this paper is to quantize deep neural networks from a high-precision format to a low-precision (e.g., INT8) dynamic fixed-point format, with quantization applied layer by layer. In addition, we further improve uniform dynamic fixed-point quantization to multi-scale dynamic fixed-point quantization for lower quantization loss. The proposed multi-scale dynamic fixed-point quantization scheme divides the quantization range into two regions, and each region is assigned different quantization levels and quantization parameters to better approximate bell-shaped distributions. The proposed quantization pipeline consists of post-training quantization followed by model fine-tuning, which keeps the accuracy drop of the quantized model within 1% mean average precision (mAP). Furthermore, the proposed quantization and fine-tuning method can be combined with model pruning to obtain a compact and accurate deep neural network with low bit-width.
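To make the two-region scheme concrete, below is a minimal NumPy sketch of a uniform dynamic fixed-point baseline and a hypothetical two-region multi-scale variant. The abstract does not specify how the region boundary or the split of quantization levels is chosen, so the parameters inner_frac and level_split, the power-of-two scale in the baseline, and the function names are all illustrative assumptions, not the paper's exact method.

import numpy as np

def dynamic_fixed_point_quantize(x, bits=8):
    # Uniform dynamic fixed-point baseline: one power-of-two scale
    # (fractional length) per tensor, chosen from its dynamic range.
    max_abs = np.max(np.abs(x)) + 1e-12
    int_len = int(np.ceil(np.log2(max_abs)))   # bits for the integer part
    frac_len = bits - 1 - int_len              # one bit reserved for sign
    scale = 2.0 ** (-frac_len)
    q = np.clip(np.round(x / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q * scale

def multi_scale_quantize(x, bits=8, inner_frac=0.5, level_split=0.75):
    # Two-region variant: values near zero (the bulk of a bell-shaped
    # distribution) get a fine step; the sparse tails get a coarse one.
    # inner_frac (region boundary as a fraction of max|x|) and
    # level_split (share of levels given to the inner region) are
    # assumed values for illustration, not the paper's parameters.
    max_abs = np.max(np.abs(x)) + 1e-12
    threshold = inner_frac * max_abs
    levels = 2 ** (bits - 1)                   # levels per sign
    inner_levels = int(level_split * levels)
    outer_levels = levels - inner_levels
    inner_scale = threshold / inner_levels     # fine step, dense region
    outer_scale = (max_abs - threshold) / outer_levels  # coarse step, tails
    y = np.empty_like(x)
    inner = np.abs(x) <= threshold
    y[inner] = np.round(x[inner] / inner_scale) * inner_scale
    sign = np.sign(x[~inner])
    mag = np.abs(x[~inner]) - threshold
    y[~inner] = sign * (threshold + np.round(mag / outer_scale) * outer_scale)
    return y

# For a zero-mean Gaussian tensor, the two-region quantizer typically
# gives lower mean-squared error, since most mass falls in the fine region.
x = np.random.randn(4096).astype(np.float32)
print(np.mean((x - dynamic_fixed_point_quantize(x)) ** 2))
print(np.mean((x - multi_scale_quantize(x)) ** 2))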
Keywords
Deep Learning, Model Quantization, Object Detection, Dynamic Fixed-Point Quantization, Multi-Scale Dynamic Fixed-Point Quantization