SimBNN: A Similarity-Aware Binarized Neural Network Acceleration Framework

2019 IEEE 27th Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM), 2019

Abstract
Binarized Neural Networks (BNNs) eliminate bitwidth redundancy in Convolutional Neural Networks (CNNs) by using a single bit (-1/+1) for network parameters and intermediate representations, which greatly reduces off-chip data transfer and storage overhead. However, considerable computation redundancy remains in BNN inference. To tackle this problem, we investigate the similarity property in input data and kernel weights. We identify an average of 79% input similarity and 61% kernel similarity, measured by our proposed metric, across common network architectures. Motivated by this observation, we propose SimBNN, a fast and energy-efficient acceleration framework for BNN inference that leverages similarity properties. SimBNN consists of a set of similarity-aware accelerators, a weight reuse optimization algorithm, and a similarity selection mechanism. SimBNN incorporates two types of BNN accelerators, which exploit input similarity and kernel similarity, respectively. More specifically, the result from the previous stage is reused whenever similarity is identified, significantly reducing BNN computation overhead. Furthermore, we propose a weight reuse optimization algorithm, which increases kernel similarity by re-ordering weight kernels offline. Finally, our framework provides a systematic method to determine the optimal reuse strategy, input data or kernel weights, based on the similarity characteristics of the input data and the pre-trained BNN.
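The kernel-similarity reuse idea can be illustrated with a minimal sketch. The paper measures how many bits consecutive binarized kernels share and re-orders kernels offline so that each kernel differs from its predecessor in as few positions as possible; the accelerator can then reuse the previous kernel's partial results. The function names (`kernel_similarity`, `reorder_kernels`), the bit-match similarity metric, and the greedy nearest-neighbour ordering below are illustrative assumptions, not the paper's actual algorithm or metric.

```python
import numpy as np

def kernel_similarity(k_a, k_b):
    """Fraction of matching bits between two binarized (+1/-1) kernels.
    Positions where the kernels agree need no recomputation when the
    previous kernel's result is reused (assumed metric, for illustration)."""
    return np.mean(k_a == k_b)

def reorder_kernels(kernels):
    """Greedy nearest-neighbour ordering: start from an arbitrary kernel and
    repeatedly append the unvisited kernel most similar to the last one,
    so consecutive kernels share as many bits as possible.
    (Sketch only; the paper's offline re-ordering algorithm may differ.)"""
    remaining = list(range(len(kernels)))
    order = [remaining.pop(0)]
    while remaining:
        last = kernels[order[-1]]
        best = max(remaining, key=lambda i: kernel_similarity(last, kernels[i]))
        remaining.remove(best)
        order.append(best)
    return order

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # 64 random binarized 3x3x128 kernels as a stand-in for a trained BNN layer
    kernels = rng.choice([-1, 1], size=(64, 3, 3, 128)).astype(np.int8)
    order = reorder_kernels(kernels)
    sims = [kernel_similarity(kernels[a], kernels[b])
            for a, b in zip(order, order[1:])]
    print(f"mean consecutive-kernel similarity after re-ordering: {np.mean(sims):.2%}")
```

On random kernels the gain from re-ordering is small; the paper's 61% kernel similarity figure is reported for trained BNNs, where kernels are far more correlated.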
Keywords
Kernel, Acceleration, Convolutional neural networks, Field programmable gate arrays, Hardware, Random access memory