Algorithm/architecture solutions to improve beyond uniform quantization in embedded DNN accelerators

Journal of Systems Architecture (2022)

Abstract
The choice of data type has a major impact on the speed, accuracy, and power consumption of deep learning accelerators. Quantizing the weights and activations of neural networks to integer-based computation is an industry standard for reducing the memory footprint and computation cost of inference in embedded systems. Uniform weight quantization can be used for tasks where a drop in accuracy can be tolerated. However, the accuracy drop caused by uniform quantization may be non-negligible, especially for shallow networks, complex computer vision tasks, or lower-bit integer formats.
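To make the baseline concrete, the following is a minimal sketch of symmetric, per-tensor uniform quantization to signed integers, the scheme the abstract refers to as the industry standard. The function names and the int8 default are illustrative choices, not taken from the paper.

```python
# Symmetric uniform quantization: map real-valued weights to signed
# integers using a single scale factor for the whole tensor.
def quantize_uniform(weights, num_bits=8):
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for int8
    scale = max(abs(w) for w in weights) / qmax
    # Round to the nearest integer and clamp to the representable range.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate real values from the integer codes.
    return [v * scale for v in q]

weights = [0.42, -1.27, 0.031, 0.9]
q, scale = quantize_uniform(weights)
recon = dequantize(q, scale)
```

The single shared scale is what makes the grid "uniform": every weight, large or small, is rounded to the same step size, which is why accuracy can degrade when the weight distribution is skewed or the bit width is low.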
Keywords
Quantization, Deep learning, Mixed precision, Accelerator