Memory Parallelism Using Custom Array Mapping To Heterogeneous Storage Structures

FPL(2006)

Abstract
Configurable architectures offer the unique opportunity of customizing the storage allocation to meet a specific application's needs. In this paper we describe a compiler approach that maps the arrays of a loop-based computation to the internal memories of a configurable architecture, with the objective of minimizing the overall execution time. We present an algorithm that considers the data access patterns of the arrays along the critical path of the computation as well as the available storage and memory bandwidth. We present experimental results from applying this approach to a set of kernel codes targeting a Field-Programmable Gate Array (FPGA). The results reveal that our algorithm outperforms naive and custom data layouts for these kernels by an average of 33% and 15%, respectively, in terms of execution time, while taking into account the available hardware resources.
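To make the idea of mapping arrays to several internal memories concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes a simplified model in which each memory bank is single-ported, each array has a known size and a known number of accesses per loop iteration on the critical path, and the cost of an iteration is the access count of the busiest bank. The names `Array` and `map_arrays` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Array:
    name: str
    size: int      # storage needed, in words (assumed known)
    accesses: int  # accesses per loop iteration on the critical path (assumed known)


def map_arrays(arrays, bank_capacities):
    """Greedy sketch: place the most-accessed arrays first, each into the
    least-loaded bank that still has room for it.

    Returns (mapping, cycles), where mapping is {array name: bank index} and
    cycles is the access count of the busiest bank, used here as a stand-in
    for memory cycles per iteration under the single-port assumption.
    """
    free = list(bank_capacities)       # remaining capacity per bank
    load = [0] * len(bank_capacities)  # accumulated accesses per bank
    mapping = {}
    for a in sorted(arrays, key=lambda arr: arr.accesses, reverse=True):
        candidates = [b for b in range(len(free)) if free[b] >= a.size]
        if not candidates:
            raise ValueError(f"no bank can hold array {a.name}")
        best = min(candidates, key=lambda b: load[b])
        mapping[a.name] = best
        free[best] -= a.size
        load[best] += a.accesses
    return mapping, max(load)


if __name__ == "__main__":
    kernel = [Array("A", 64, 4), Array("B", 64, 2), Array("C", 64, 2)]
    print(map_arrays(kernel, [192]))       # everything in one memory
    print(map_arrays(kernel, [128, 128]))  # two banks accessed in parallel
```

In this toy model, spreading the arrays over two banks halves the per-iteration access count compared with a single memory. A real mapping would also have to account for register storage, multi-ported memories, and the scheduling of the datapath, none of which this sketch models.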
Keywords
field programmable gate arrays, bandwidth, computer architecture, memory bandwidth, critical path, kernel, parallel processing, algorithm design and analysis, hardware, data access