An Analytical Study of Recursive Tree Traversal Patterns on Multi- and Many-Core Platforms

2017 IEEE 23rd International Conference on Parallel and Distributed Systems (ICPADS), 2017

Abstract
Recursive tree traversals are found in many application domains, such as data mining, graphics, machine learning, and scientific simulations. In the past few years, there has been growing interest in deploying applications based on graph data structures on many-core devices. A couple of recent efforts have focused on optimizing the execution of multiple serial tree traversals on GPUs and have reported performance trends that vary across algorithms. In this work, we aim to understand how to select the implementation and platform best suited to a given tree traversal algorithm and dataset. To this end, we perform a systematic study of recursive tree traversal on CPU, GPU, and the Intel Phi processor. We first identify four tree traversal patterns: three perform multiple serial traversals concurrently, and the fourth performs a single parallel level-order traversal. For each of these patterns, we consider different code variants, including existing and new optimization methods, and we characterize their control-flow and memory access patterns. We implement these code variants and evaluate them on CPU, GPU, and Intel Phi. Our analysis shows that no single code variant and platform achieves the best performance on all tree traversal patterns, and it provides guidelines for selecting the implementation best suited to a given tree traversal pattern and input dataset.
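
To make the two broad categories named in the abstract concrete, the following is a minimal serial sketch in Python. It is not the paper's code: the abstract does not name the four patterns or give any implementation, so the binary search tree, the `Node` type, and the helpers `insert`, `contains`, `multiple_serial_traversals`, `level_order`, and `build_demo_tree` are all hypothetical names chosen for illustration. The first category is shown as many independent recursive searches (one per query, each of which a parallel version might assign to its own thread), and the second as a single level-order traversal that processes one frontier at a time (where a parallel version might process all nodes of a frontier concurrently).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical binary search tree node used only for this sketch."""
    key: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int) -> Node:
    """Insert a key into the tree rooted at `root` and return the root."""
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def contains(node: Optional[Node], query: int) -> bool:
    """One serial recursive traversal: a root-to-leaf search for one query."""
    if node is None:
        return False
    if query == node.key:
        return True
    child = node.left if query < node.key else node.right
    return contains(child, query)


def multiple_serial_traversals(root: Optional[Node], queries: List[int]) -> List[bool]:
    """Run one independent recursive traversal per query.

    The traversals share no state, so a parallel version could map each
    query to its own thread; here they simply run one after another.
    """
    return [contains(root, q) for q in queries]


def level_order(root: Optional[Node]) -> List[List[int]]:
    """Single level-order traversal, one frontier (tree level) per step.

    All nodes in a frontier are independent of each other, which is where
    a parallel version could find its parallelism.
    """
    levels: List[List[int]] = []
    frontier = [root] if root is not None else []
    while frontier:
        levels.append([n.key for n in frontier])
        # Build the next frontier from the children of the current one.
        frontier = [c for n in frontier for c in (n.left, n.right) if c is not None]
    return levels


def build_demo_tree() -> Node:
    """Build a small example tree from a fixed key sequence."""
    root = None
    for key in [8, 3, 10, 1, 6, 14]:
        root = insert(root, key)
    return root
```

In the first function the work per query depends on the path taken through the tree, so concurrent queries can follow different branches and touch different nodes; in the second, the available parallelism depends on how wide each level is. These are the kinds of control-flow and memory access differences the abstract says the study characterizes.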
Keywords
recursive tree traversal, many-core processor, parallelism, GPU