Latency Hiding of Log-Depth Scan and Reduce Networks in Heterogenous Embedded Systems

2023 IEEE 29th International Symposium for Design and Technology in Electronic Packaging (SIITME), 2023

Abstract
This paper discusses methods, algorithmic examples, and general principles for reducing latency in single-chip Map-Reduce and Map-Scan many-core architectures. Processors designed for embedded systems are limited in both performance and power consumption when running intense, rather than complex, computations. A common solution is to add accelerators to the host processor in order to offload parts of the intense computations. We consider a Map-Scan-Reduce many-core architecture to be highly effective as a general-purpose accelerator. In this paper, we discuss the latencies introduced by the Scan and Reduce networks and ways to hide them, based on practical applications and the solutions we have employed. Proper use of pipelining and algorithmic improvements yields, in simulation, superlinear acceleration relative to the number of processing cores for the algorithms presented: matrix-vector/matrix multiplication, FFT, and pooling.
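To make the idea of hiding log-depth network latency concrete, the following is a minimal sketch and not the authors' implementation. It assumes a balanced binary reduction tree in which each level costs one cycle, and it compares issuing reductions one at a time against streaming them through the tree as a pipeline. The function names (`tree_reduce`, `cycles_unpipelined`, `cycles_pipelined`) and the one-cycle-per-level cost model are illustrative assumptions only.

```python
def tree_reduce(values):
    """Sum a non-empty sequence with a balanced binary tree.

    Returns (total, depth), where depth is the number of tree levels,
    i.e. ceil(log2(n)) for n inputs.
    """
    level = list(values)
    if not level:
        raise ValueError("tree_reduce needs at least one value")
    depth = 0
    while len(level) > 1:
        # Pair adjacent elements; an odd trailing element passes through.
        nxt = [level[i] + level[i + 1] for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


def cycles_unpipelined(num_reductions, depth):
    # Assumed model: one issue cycle, then wait for the full tree depth
    # before the next reduction may start.
    return num_reductions * (1 + depth)


def cycles_pipelined(num_reductions, depth):
    # Assumed model: a new vector enters the tree every cycle, so the
    # tree depth is paid only once, at the end of the stream.
    return num_reductions + depth


total, depth = tree_reduce(range(1, 9))   # 8 inputs -> 3 levels
print(total, depth)
print(cycles_unpipelined(64, depth), cycles_pipelined(64, depth))
```

Under this simplified model, the latency of the network matters less and less as more independent reductions are streamed back to back, which is the general effect that pipelining is meant to achieve.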
Keywords
Latency avoidance, Map-Scan accelerator, MapReduce accelerator