Performance Evaluation Of Scale-Free Graph Algorithms In Low Latency Non-Volatile Memory

2017 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2017

Abstract
The purpose of this study is to quantitatively assess the performance of graph processing algorithms for large scale-free graphs residing in byte-addressable Non-Volatile Memory (NVM). Our study focuses on static and dynamic graph algorithms previously optimized for external memory in the form of locally attached NAND Flash arrays, with data structures tuned to maximize locality. The evaluation is run on a unique resource, an NVM hardware emulator from Intel that inserts delays into memory reads through microcode that stalls load instructions missing in the L3 cache; the emulated NVM appears as a separate UMA node (identified by a physical address range) that is not part of the socket-attached NUMA nodes. In this work, we distinguish two graph processing configurations: "semi-external", in which the graph is fully resident in NVM but in-flight intermediate data structures reside in DRAM, and "fully external", in which both the graph and the intermediate data structures reside in NVM. Our goal is to assess the performance impact of NVM latency of up to 3.5x that of DRAM, with (semi-external) and without (fully external) an application-specific scratchpad for the in-flight data structures. We find a performance penalty of 59.6% in the fully external scenario, which is reduced to 5.2% with the scratchpad. Our results show that graph algorithms employing locality-aware data structure layout and processing can benefit immediately from emerging NVMs with minimal performance impact, making NVM a high-value resource for large-scale graph processing.
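The two configurations named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a breadth-first search over a CSR-encoded graph, uses a memory-mapped file as a stand-in for the NVM-resident graph, and uses a second file-backed mapping as a stand-in for NVM-resident intermediate state. The names `write_csr`, `open_state`, and `bfs_levels`, the one-byte level encoding, and the choice of BFS are all assumptions made for illustration. For brevity, only the per-vertex level array moves between the two tiers; the frontier queue stays in ordinary memory in both modes.

```python
import mmap
import struct
import tempfile
from collections import deque

UNVISITED = 255  # one byte per vertex, so this sketch only supports BFS depth < 255


def write_csr(path, num_vertices, edges):
    """Write an undirected graph as CSR: uint32 offsets followed by uint32 targets."""
    adj = [[] for _ in range(num_vertices)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    offsets, targets = [0], []
    for nbrs in adj:
        targets.extend(sorted(nbrs))  # sorted neighbours keep the layout sequential
        offsets.append(len(targets))
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(offsets)}I", *offsets))
        f.write(struct.pack(f"<{len(targets)}I", *targets))


def open_state(num_vertices, fully_external):
    """Return the per-vertex level array in the 'DRAM' or the 'NVM' stand-in."""
    if not fully_external:
        return bytearray([UNVISITED]) * num_vertices  # semi-external: scratchpad in DRAM
    f = tempfile.TemporaryFile()
    f.write(bytes([UNVISITED]) * num_vertices)
    f.flush()
    return mmap.mmap(f.fileno(), num_vertices)  # fully external: file-backed mapping


def bfs_levels(path, num_vertices, source, fully_external=False):
    """BFS from `source`; the graph is always read through a read-only mapping."""
    with open(path, "rb") as f:
        graph = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    targets_base = 4 * (num_vertices + 1)  # byte offset where the targets array starts
    level = open_state(num_vertices, fully_external)
    level[source] = 0
    frontier = deque([source])
    while frontier:
        u = frontier.popleft()
        start, end = struct.unpack_from("<2I", graph, 4 * u)
        nbrs = struct.unpack_from(f"<{end - start}I", graph, targets_base + 4 * start)
        for v in nbrs:
            if level[v] == UNVISITED:
                level[v] = level[u] + 1
                frontier.append(v)
    result = list(level[:num_vertices])
    graph.close()
    return result
```

Calling `bfs_levels(path, n, source)` corresponds to the semi-external layout, and passing `fully_external=True` moves the level array into the file-backed mapping. Both calls return the same levels; on real hardware only the placement of the random-access state, and therefore its access latency, would differ.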
Keywords
Non-Volatile Memory,Performance Evaluation,Scale-Free Graph Algorithms