
Exploiting Variable Precision Computation Array for Scalable Neural Network Accelerators

2020 2nd IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2020

Abstract
In this paper, we present a flexible Variable Precision Computation Array (VPCA) component for different accelerators. It leverages a sparsification scheme for activations and a low-bit serial-parallel combination computation unit to improve the efficiency and resiliency of accelerators. The VPCA can dynamically decompose the width of activations/weights (from 32 bits down to 3 bits in different accelerators) into 2-bit serial computation units, while the 2-bit computing units can be combined in parallel for high throughput. We propose an on-the-fly compressing and calculating strategy, SLE-CLC (single lane encoding, cross lane calculation), which can further improve the performance of 2-bit parallel computing. Experimental results on image classification datasets show that VPCA outperforms DaDianNao, Stripes, and Loom-2bit by 4.67×, 2.42×, and 1.52×, respectively, without other overhead on convolution layers.
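The idea of decomposing an operand into 2-bit serial units can be illustrated with a small software sketch. This is not the VPCA design or the SLE-CLC strategy, whose details the abstract does not give; it only shows generic 2-bit digit-serial arithmetic under stated assumptions: activations are unsigned integers (for example, post-ReLU values), weights are signed integers, and zero digits are skipped as a stand-in for activation sparsity. The names `split_2bit` and `digit_serial_dot` are hypothetical.

```python
def split_2bit(value, width):
    """Split an unsigned integer into 2-bit digits, least significant first.

    Assumption: odd widths (such as 3 bits) are padded up to the next even width.
    """
    if value < 0 or value >= (1 << width):
        raise ValueError("value does not fit in the given width")
    n_digits = (width + 1) // 2
    return [(value >> (2 * i)) & 0b11 for i in range(n_digits)]


def digit_serial_dot(activations, weights, width):
    """Dot product that consumes one 2-bit activation digit per serial step.

    Returns (result, multiplies), where `multiplies` counts the 2-bit-by-weight
    products actually performed after skipping zero digits.
    """
    digits = [split_2bit(a, width) for a in activations]
    acc = 0
    multiplies = 0
    for step in range((width + 1) // 2):
        # In hardware, the lanes for one digit position would run in parallel;
        # here they are simply summed in a loop.
        partial = 0
        for lane_digits, w in zip(digits, weights):
            d = lane_digits[step]
            if d == 0:
                continue  # zero digit: no work needed for this lane and step
            partial += d * w
            multiplies += 1
        # Digit `step` carries weight 4**step, i.e. a left shift by 2*step bits.
        acc += partial << (2 * step)
    return acc, multiplies


result, multiplies = digit_serial_dot([5, 0, 12], [3, -2, 7], width=4)
print(result, multiplies)
```

In this sketch, a narrower operand width means fewer serial steps, and zero digits reduce the number of products, which is the general intuition behind variable-precision, sparsity-aware arrays.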
Key words
Deep Neural Networks, Accelerator, Energy Efficiency Computing Array, Dynamic Quantization, Resiliency