On the Importance of Improving Cache Locality in Application-Specific Accelerators via HLS

Yasin Alptekin, Ismail San

2020 28th Signal Processing and Communications Applications Conference (SIU) (2020)

Abstract
Designing a hardware architecture in a low-level hardware description language (Verilog, VHDL) is a difficult and time-consuming task, especially when the application is complex and memory-intensive. A high-level synthesis (HLS) tool, a technology under active research in several groups, automatically generates an RTL description of the hardware architecture from a high-level (C/C++) program. However, an application to be accelerated in hardware via an HLS tool needs to be written so that the overall memory access latency decreases, which can be achieved by simply rewriting the code so that the restructured loops have more locality. In this paper, we present two case studies that decrease memory access latency by improving the locality of a given application: the memory access pattern of the application being accelerated via HLS is reorganized for hardware that has a cache. We also emphasize the importance of locality for the performance of hardware accelerators with our empirical results on a Zynq-based SoC platform.
Keywords
Locality, cache, domain-specific accelerator, system-on-chip, FPGA
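
The abstract does not say which two applications the case studies cover, so the following is only a generic sketch of the kind of loop restructuring it refers to, not the authors' code. It assumes a row-major N x N matrix stored in a flat array; the names `N`, `sum_column_first`, and `sum_row_first` are hypothetical. Both functions compute the same sum, but the second visits memory in address order, so consecutive accesses tend to fall in the same cache line.

```c
#include <assert.h>
#include <stddef.h>

#define N 64 /* hypothetical matrix dimension, chosen only for illustration */

/* Column-first traversal of a row-major matrix: consecutive accesses are
 * N elements apart, so they rarely share a cache line (poor spatial locality). */
long sum_column_first(const int *a)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += a[i * N + j];
    return sum;
}

/* Same computation after loop interchange: the inner loop now walks
 * adjacent addresses, so each fetched cache line is fully used. */
long sum_row_first(const int *a)
{
    assert(a != NULL);
    long sum = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += a[i * N + j];
    return sum;
}
```

The interchange leaves the result unchanged and alters only the order of memory accesses. How much latency it saves depends on the cache configuration and the memory interface of the target platform.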