Software-based Dynamic Overlays Require Fast, Fine-grained Partial Reconfiguration

Proceedings of the 10th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (2019)

Abstract
In this paper, we consider dynamic overlays, which use fine-grained partial reconfiguration (PR) to continuously adapt to their software-based workload. In particular, we show how to modify a traditional (static) overlay developed for OpenVX into a dynamic overlay. Using a Xilinx FPGA, we show that the dynamic overlay needs several unsupported features for performance, including faster PR, relocatability, and fine-grained configuration. Since these features are not available in Xilinx FPGAs, we estimate the application-level speedup they would provide. We find that vector custom instruction (VCI) chaining, which allows a VCI to cascade its result directly into another VCI, is also essential. Overall, we find that the static overlay achieves a speedup of roughly 20x over a Cortex-A9 processor, but with improved PR and chaining a speedup of 106x is attainable. While there have been calls for fast, fine-grained PR devices for decades, we believe that dynamic overlays may be the first true "killer application" that will justify adding these features to all FPGA devices.