WinoTrain: Winograd-Aware Training for Accurate Full 8-bit Convolution Acceleration

DAC (2023)

Abstract
Efficient inference is critical for realizing low-power, real-time implementations of convolutional neural networks (CNNs) on compute- and memory-constrained embedded platforms. By combining quantization techniques with fast convolution algorithms such as Winograd, CNN inference can gain in both latency and energy consumption. Performing Winograd convolution involves (1) transforming the weights and activations to the Winograd domain, (2) performing element-wise multiplication on the transformed tensors, and (3) transforming the results back to the conventional spatial domain. Combining Winograd with quantization of all of its steps results in severe accuracy degradation due to numerical instability. In this paper, we propose a simple quantization-aware training technique that quantizes all three steps of the Winograd convolution while using a minimal number of scaling factors. Additionally, we propose an FPGA accelerator employing tiling and unrolling methods to highlight the performance benefits of the fully 8-bit quantized Winograd algorithm. We achieve a 2x reduction in inference time compared to standard convolution on ResNet-18 for the ImageNet dataset, while improving Top-1 accuracy by 55.7 p.p. compared to a standard post-training quantized Winograd variant of the network.
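To make the three steps above concrete, below is a minimal NumPy sketch of a single F(2x2, 3x3) Winograd tile with symmetric per-tensor int8 fake quantization applied after each stage. The max-abs scale selection and the fake_quant helper are generic illustrative assumptions, not the paper's minimal-scaling-factor scheme.

```python
import numpy as np

# Standard Winograd F(2x2, 3x3) transform matrices (Lavin & Gray).
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)  # input transform
G  = np.array([[1.0,  0.0, 0.0],
               [0.5,  0.5, 0.5],
               [0.5, -0.5, 0.5],
               [0.0,  0.0, 1.0]], dtype=np.float32)  # weight transform
AT = np.array([[1, 1,  1,  0],
               [0, 1, -1, -1]], dtype=np.float32)    # output transform

def fake_quant(x):
    # Hypothetical helper: symmetric per-tensor int8 fake quantization.
    # Quantize to the [-127, 127] grid, then dequantize with the same scale.
    scale = max(np.abs(x).max(), 1e-8) / 127.0
    return np.clip(np.round(x / scale), -127, 127) * scale

rng = np.random.default_rng(0)
d = rng.standard_normal((4, 4)).astype(np.float32)  # 4x4 input tile
g = rng.standard_normal((3, 3)).astype(np.float32)  # 3x3 filter

U = fake_quant(G @ g @ G.T)    # step (1): weights to the Winograd domain
V = fake_quant(BT @ d @ BT.T)  # step (1): activations to the Winograd domain
M = fake_quant(U * V)          # step (2): element-wise multiplication
Y = AT @ M @ AT.T              # step (3): back to the spatial domain (2x2 output)

# Reference: direct 3x3 convolution (ML convention, i.e., cross-correlation).
ref = np.array([[(d[i:i+3, j:j+3] * g).sum() for j in range(2)]
                for i in range(2)])
print(np.abs(Y - ref).max())   # nonzero: error introduced by quantization
```

Quantization-aware training, as described in the abstract, would insert such fake-quantization nodes into the training graph so the network learns to tolerate the rounding error of all three stages; this sketch only illustrates why naive post-training quantization of the transforms is lossy.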
Keywords
accurate full 8-bit convolution acceleration, CNN inference, conventional spatial domain, convolutional neural networks, energy consumption, fast convolutional algorithms, FPGA accelerator employing tiling, inference time, memory-constrained embedded platforms, performing Winograd convolution, quantization techniques, real-time implementation, severe accuracy degradation, simple quantization-aware training technique, standard convolution, standard post-training quantized Winograd variant, steps results, transformed tensors, Winograd algorithm, Winograd domain, Winograd-aware training, WinoTrain