Unified Virtual Memory Support for Deep CNN Accelerator on SoC FPGA.

ICA3PP(2015)

Abstract
Cooperation between the CPU and a hardware accelerator on an SoC FPGA to accomplish computationally intensive tasks provides significant advantages in performance and energy efficiency. However, current operating systems provide little support for accelerators: the OS is unaware that a computational task can be executed on either a CPU core or an accelerator, and it provides no assistance in efficiently managing data shared between the CPU and the accelerator in DRAM, such as zero-copy transfer and data coherence. It is also hard for a current OS to allocate large contiguous physical memory regions for an accelerator. In this paper, we select the Xilinx ZYNQ as the target platform and qualitatively analyze methods of sharing data. Besides using the high-performance (HP) AXI interfaces of the ZYNQ device, we develop a novel memory management system for FPGA-based accelerators. It provides a unified virtual address space for the CPU cores and the accelerator so that they can access the same memory in the operating system's user space. For a deep convolutional neural network task, our design achieves a speed-up of up to 5.34x compared to traditional processor-accelerator cooperation.
Keywords
Unified virtual memory, Coherence, SoC, Deep CNN, Accelerator
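
The idea of a unified virtual space mentioned in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation; the page size, the `ToyPageTable` class, and the `cpu_write` and `accelerator_read` helpers are all hypothetical names introduced for illustration. It shows only the general principle: if the CPU and the accelerator translate addresses through the same page table, a buffer that is contiguous in virtual memory can sit in scattered physical frames, so no large contiguous physical allocation and no extra copy is needed.

```python
PAGE_SIZE = 4096  # assumed page size in bytes


class ToyPageTable:
    """Hypothetical page table: virtual page number -> physical frame number."""

    def __init__(self):
        self.entries = {}

    def map(self, vpn, pfn):
        self.entries[vpn] = pfn

    def translate(self, vaddr):
        # Split the virtual address into page number and in-page offset.
        vpn, offset = divmod(vaddr, PAGE_SIZE)
        if vpn not in self.entries:
            raise KeyError(f"page fault at virtual address {vaddr:#x}")
        return self.entries[vpn] * PAGE_SIZE + offset


def cpu_write(page_table, dram, vaddr, data):
    """CPU side: write bytes at a virtual address, one page at a time."""
    pos = 0
    while pos < len(data):
        paddr = page_table.translate(vaddr)
        chunk = min(len(data) - pos, PAGE_SIZE - vaddr % PAGE_SIZE)
        dram[paddr:paddr + chunk] = data[pos:pos + chunk]
        vaddr += chunk
        pos += chunk


def accelerator_read(page_table, dram, vaddr, length):
    """Accelerator side: read through the same page table, with no staging copy."""
    out = bytearray()
    while length > 0:
        paddr = page_table.translate(vaddr)
        chunk = min(length, PAGE_SIZE - vaddr % PAGE_SIZE)
        out += dram[paddr:paddr + chunk]
        vaddr += chunk
        length -= chunk
    return bytes(out)


# Three virtually contiguous pages backed by non-contiguous physical frames.
dram = bytearray(8 * PAGE_SIZE)
page_table = ToyPageTable()
for vpn, pfn in [(0, 5), (1, 2), (2, 7)]:
    page_table.map(vpn, pfn)

payload = bytes(range(256)) * 40  # 10240 bytes, spans all three pages
cpu_write(page_table, dram, 100, payload)
result = accelerator_read(page_table, dram, 100, len(payload))
```

Here `result` equals `payload` even though the three pages live in physical frames 5, 2, and 7. A real design would also have to handle cache coherence and page pinning, which this model ignores.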