An Energy-Efficient YOLO Accelerator Optimizing Filter Switching Activity.

ISCAS (2022)

Abstract
Convolutional neural network (CNN) based object detectors such as You Only Look Once (YOLO) achieve remarkable performance but come with high computational complexity and large memory-bandwidth requirements. It is therefore challenging to design an accelerator for such detectors on edge devices, which have a limited power budget and relatively small on-chip memory. In this paper, we propose an energy- and memory-efficient CNN accelerator for YOLO. First, we propose a novel dataflow that reduces the amount of filter switching by 99.56% on average. Second, we propose a layer-wise on-chip memory reuse scheme in which multi-bank on-chip buffers are efficiently utilized for both feature maps (FMs) and filters, without the need to access external memory for FMs. The proposed design is implemented on a Xilinx ZC706 FPGA and consumes 5.05 W while achieving a throughput of 370.5 GOPS for Tiny-YOLOv2, using only 640 DSPs and 322.5 BRAMs. Our design achieves an energy efficiency of 73.39 GOPS/W, outperforming previous works.
Keywords
YOLO, FPGA, on-chip memory, energy
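
The energy-efficiency figure quoted in the abstract is throughput divided by power. As a minimal sketch of that arithmetic, the snippet below recomputes it from the rounded figures in the abstract. The function name `energy_efficiency` and the per-DSP ratio are illustrative assumptions, not taken from the paper.

```python
def energy_efficiency(throughput_gops: float, power_w: float) -> float:
    """Return energy efficiency in GOPS/W (hypothetical helper for illustration)."""
    if power_w <= 0:
        raise ValueError("power must be positive")
    return throughput_gops / power_w


# Figures as quoted in the abstract (rounded as published).
THROUGHPUT_GOPS = 370.5  # Tiny-YOLOv2 throughput
POWER_W = 5.05           # power on the Xilinx ZC706
DSP_COUNT = 640          # DSP slices used

efficiency = energy_efficiency(THROUGHPUT_GOPS, POWER_W)

# Throughput per DSP is an extra ratio added here for illustration only;
# the abstract does not report it.
gops_per_dsp = THROUGHPUT_GOPS / DSP_COUNT

print(f"{efficiency:.2f} GOPS/W, {gops_per_dsp:.3f} GOPS per DSP")
```

With these rounded inputs the ratio comes out at about 73.37 GOPS/W, marginally below the reported 73.39 GOPS/W. The small gap is presumably due to rounding in the published throughput and power figures.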