Parametrization of Algorithms and FPGA Accelerators To Predict Performance

msra(2007)

Abstract
This paper presents a scheme for characterizing computational algorithms and computing hardware separately, then combining those analyses to assess how well a piece of hardware suits a scientific algorithm. The analysis of the algorithm concentrates on a continuous computational density function, ρ, that characterizes the loss of computational efficiency as a function of local store size. A hardware system has multiple layers of cache and data communication, each with a measured bandwidth, latency, and cache size. To predict a performance limit for an algorithm on a piece of hardware, each layer is combined with the algorithm’s computational density function to compute the limit that layer places on the calculation speed. The lowest of these per-layer limits is then the upper bound on the algorithm’s performance on that hardware platform.
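The "lowest layer limit wins" idea can be illustrated with a minimal sketch. The abstract does not give the formula for combining a layer with ρ, so the code below assumes a simple roofline-style rule: ρ(s) is treated as useful operations per byte transferred when the local store holds s bytes, and a layer's limit is its bandwidth times ρ evaluated at its store size. Latency is ignored. The names `Layer`, `layer_limit`, and `performance_bound`, and the square-root density in the usage lines, are hypothetical and not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Layer:
    """One level of the memory/communication hierarchy (hypothetical model)."""
    name: str
    bandwidth_bytes_per_s: float  # measured bandwidth feeding this layer's store
    store_bytes: float            # size of the local store at this layer


def layer_limit(layer: Layer, rho: Callable[[float], float]) -> float:
    """Operations per second this layer allows.

    ASSUMPTION: limit = bandwidth * rho(store size), with rho in ops per byte.
    The paper's actual combination rule may differ and may include latency.
    """
    return layer.bandwidth_bytes_per_s * rho(layer.store_bytes)


def performance_bound(
    layers: List[Layer], rho: Callable[[float], float]
) -> Tuple[float, str]:
    """Return (lowest per-layer limit, name of the limiting layer)."""
    if not layers:
        raise ValueError("at least one layer is required")
    return min((layer_limit(layer, rho), layer.name) for layer in layers)


# Illustrative numbers only; not measurements from any real platform.
example_layers = [
    Layer("L1", bandwidth_bytes_per_s=100.0, store_bytes=16.0),
    Layer("DRAM", bandwidth_bytes_per_s=10.0, store_bytes=400.0),
]
# Hypothetical density: data reuse grows with the square root of the store size.
example_bound = performance_bound(example_layers, math.sqrt)
```

Here `example_bound` identifies which layer constrains the computation; under the assumed rule, enlarging the store or raising the bandwidth of any other layer would not raise the overall bound.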