A Data Layout Transformation (DLT) Accelerator: Architectural Support For Data Movement Optimization In Accelerated-Centric Heterogeneous Systems

2016 Design, Automation & Test in Europe Conference & Exhibition (DATE)(2016)

Abstract
Technology scaling and the growing use of accelerators make optimizing data movement increasingly important in all computing systems. Further, the growing diversity of memory structures makes embedding such optimization in software non-portable. We propose a novel architectural solution called Data Layout Transformation (DLT), together with a simple set of instructions that enables software to describe the required data movement compactly and frees the implementation to optimize that movement based on knowledge of the memory hierarchy and system structure. The DLT architecture ideas are applicable to both general-purpose and accelerator-based heterogeneous systems. Experimental results first show that the proposed DLT architecture can use nearly the full bandwidth (>97%) of a wide range of memory systems (DDR3 and HMC) at a relatively low implementation cost: it occupies only 0.24 mm² and consumes 75 mW at 1 GHz in 32 nm CMOS technology. Our evaluation of the DLT accelerator in an accelerator-based heterogeneous system across DDR3 and HMC memory shows that the DLT improves system performance by 4.6x-99x (DDR3) and 4.4x-115x (HMC), which translates into energy-efficiency improvements of 2.8x-48x (DDR3) and 1.4x-39x (HMC).
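To make the idea of describing data movement compactly more concrete, the following is a minimal software sketch of one common layout transformation, array-of-structures to structure-of-arrays. The abstract does not specify the DLT instruction format, so the `StridedMove` descriptor, its fields, and the `execute` and `aos_to_soa` helpers are all hypothetical names chosen for illustration; they are not the paper's instructions, and the loop only models in software what an implementation would be free to schedule against the actual memory system.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StridedMove:
    """Hypothetical descriptor (not from the paper): copy `count` elements,
    reading with stride `src_stride` and writing with stride `dst_stride`."""
    src_base: int
    src_stride: int
    dst_base: int
    dst_stride: int
    count: int


def execute(move: StridedMove, src: List[int], dst: List[int]) -> None:
    # Software model of one descriptor: a strided gather into a strided scatter.
    for i in range(move.count):
        dst[move.dst_base + i * move.dst_stride] = src[move.src_base + i * move.src_stride]


def aos_to_soa(src: List[int], num_records: int, num_fields: int) -> List[int]:
    # One descriptor per field: gather that field from every record and
    # write it contiguously into the destination buffer.
    dst = [0] * (num_records * num_fields)
    for field in range(num_fields):
        move = StridedMove(
            src_base=field,
            src_stride=num_fields,
            dst_base=field * num_records,
            dst_stride=1,
            count=num_records,
        )
        execute(move, src, dst)
    return dst


# Three records with three fields each, stored record by record.
records = [1, 10, 100, 2, 20, 200, 3, 30, 300]
soa = aos_to_soa(records, num_records=3, num_fields=3)
print(soa)
```

In this sketch the caller states only what should move (base addresses, strides, counts). How the copy is ordered or batched is left to whatever executes the descriptor, which is the separation of concerns the abstract attributes to the DLT instructions.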
Keywords
data layout transformation accelerator,DLT accelerator,architectural support,data movement optimization,accelerated-centric heterogeneous systems,technology scaling,computing systems,memory structures,memory hierarchy,system structure,DLT architecture,accelerator-based heterogeneous systems,memory systems,CMOS technology,accelerated-based heterogeneous system,HMC memory,DDR3 memory,energy efficiency