A 16.41 TOPS/W CNN Accelerator with Event-Based Layer Fusion for Real-Time Inference

Jiawei Wang, Li Lun, Zhenhui Dai, Yuanyuan Jiang, Xiaoxin Cui

2024 IEEE International Symposium on Circuits and Systems (ISCAS), 2024

Abstract
This paper proposes a convolutional neural network (CNN) accelerator architecture for real-time tasks on edge devices. An event-based layer fusion technique is adopted to eliminate the on-chip storage requirements and off-chip data movement caused by features. A cross-layer pipeline is elaborated during layer fusion to obtain high throughput and low latency. An adaptive, fully unrolled event-driven core is designed, and a cyclic storage method is exploited to reduce the storage space for partial sums in the core. A modified LeNet is accelerated with the proposed architecture. The accelerator reaches an energy efficiency of 16.41 TOPS/W and a latency of 0.85 μs in TSMC 28 nm technology, and a frame rate of 369.4K FPS on an FPGA.
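The abstract does not spell out how the cyclic storage works, so the following is only a minimal software sketch of one plausible reading, not the authors' design. It assumes raster-order input, a square K x K kernel, stride 1, and no padding; the function name `event_conv_rows` and all variable names are hypothetical. Each nonzero input pixel is treated as an event that scatters its contribution into the affected partial sums, and only K rows of partial sums are kept live in a buffer indexed by output row modulo K.

```python
def event_conv_rows(image, kernel):
    """Valid 2-D cross-correlation driven by nonzero input events.

    Partial sums live in a cyclic buffer of K rows instead of a full
    output feature map (an assumed scheme, not the paper's exact design).
    """
    H, W = len(image), len(image[0])
    K = len(kernel)                          # square K x K kernel assumed
    out_h, out_w = H - K + 1, W - K + 1
    psum = [[0] * out_w for _ in range(K)]   # cyclic partial-sum rows
    output = []
    for y in range(H):                       # inputs arrive in raster order
        for x in range(W):
            v = image[y][x]
            if v == 0:
                continue                     # no event, no work
            for ky in range(K):
                oy = y - ky                  # output row touched by this event
                if not 0 <= oy < out_h:
                    continue
                row = psum[oy % K]           # cyclic slot for that output row
                for kx in range(K):
                    ox = x - kx
                    if 0 <= ox < out_w:
                        row[ox] += v * kernel[ky][kx]
        done = y - K + 1                     # output row completed by input row y
        if 0 <= done < out_h:
            slot = psum[done % K]
            output.append(slot[:])           # emit the finished row downstream
            for i in range(out_w):
                slot[i] = 0                  # free the slot for row done + K
    return output


image = [[1, 0, 2, 0],
         [0, 3, 0, 0],
         [0, 0, 0, 4],
         [5, 0, 0, 0]]
kernel = [[1, 2],
          [3, 4]]
result = event_conv_rows(image, kernel)
```

In this sketch a finished output row is handed on as soon as the last input row that can affect it has been consumed, which is the property a fused next layer would rely on to start early. The partial-sum storage stays at K rows regardless of the feature-map height.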
Keywords
CNN accelerator, layer fusion, event-driven, real time