Understanding The Performance Of Stencil Computations On Intel'S Xeon Phi

2013 IEEE INTERNATIONAL CONFERENCE ON CLUSTER COMPUTING (CLUSTER)(2013)

引用 18|浏览52
暂无评分
摘要
Accelerators are becoming prevalent in high performance computing as a way of achieving increased computational capacity within a smaller power budget. Effectively utilizing the raw compute capacity made available by these systems, however, remains a challenge because it can require a substantial investment of programmer time to port and optimize code to effectively use novel accelerator hardware. In this paper we present a methodology for isolating and modeling the performance of common performance-critical patterns of code (so-called idioms) and other relevant behavioral characteristics from large scale HPC applications which are likely to perform favorably on Intel Xeon Phi. The benefits of the methodology are twofold: (1) it directs programmer efforts toward the regions of code most likely to benefit from porting to the Xeon Phi and (2) provides speedup estimates for porting those regions of code. We then apply the methodology to the stencil idiom, showing performance improvements of up to a factor of 4.7x on stencil-based benchmark codes.
更多
查看译文
关键词
parallel processing
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要