Algorithm/architecture solutions to improve beyond uniform quantization in embedded DNN accelerators

Journal of Systems Architecture (2022)

Abstract
The choice of data type has a major impact on the speed, accuracy, and power consumption of deep learning accelerators. Quantizing the weights and activations of neural networks to integer-based computation is an industry standard for reducing the memory footprint and computation cost of inference in embedded systems. Uniform weight quantization can be used for tasks where a drop in accuracy can be tolerated. However, the accuracy drop caused by uniform quantization may be non-negligible, especially for shallow networks, complex computer vision tasks, or lower-bit integer formats.
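To make the baseline concrete, the following is a minimal sketch of symmetric, per-tensor uniform quantization to signed integers, the scheme the abstract refers to as the industry standard. The function names and the int8 default are illustrative choices, not taken from the paper.

```python
# Symmetric uniform quantization: map real-valued weights to signed
# integers using a single scale factor for the whole tensor.
def quantize_uniform(weights, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate real values from the integer codes.
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.031, 0.9]
q, scale = quantize_uniform(weights)
recon = dequantize(q, scale)
```

The single shared scale is what makes the grid "uniform": every weight, large or small, is rounded to the same step size, which is why accuracy can degrade when the weight distribution is skewed or the bit width is low.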
Keywords
Quantization, Deep learning, Mixed precision, Accelerator