XTASK: eXTreme fine-grAined concurrent taSK invocation runtime

semanticscholar(2017)

Abstract
Exascale computers are expected to comprise millions of nodes and billions of threads of execution. To support high degrees of parallelism across applications, thread and task scheduling needs to be fine-grained and should be able to execute on the order of tens to a few hundred CPU cycles. Overdecomposing applications into fine-grained tasks is ideal for achieving maximum speedup, so there is a need for a parallel runtime system that can launch tasks for execution and report their results with very low latency at high levels of concurrency. This work aims to enable the launch of independent tasks on many-core accelerator hardware architectures, and to provide mechanisms that support fine-grained tasks, on the order of tens to a few hundred CPU cycles, at large scale. This work also analyzes the performance of various queue-based data structures commonly used in parallel programming languages and runtime systems. This analysis is essential for designing an efficient runtime system that schedules billions of tasks with very low latency and high throughput. Lastly, the runtime would also support the data dependencies and task dependencies required for task-based shared-memory parallel programming.
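As a rough illustration of the queue-based task launch the abstract describes, the sketch below feeds independent tasks through a shared queue to a pool of worker threads and reports their results. It is not XTASK's implementation: `run_tasks`, `num_workers`, and the use of Python's `queue.SimpleQueue` are assumptions chosen for brevity, and Python threads come nowhere near the tens-of-cycles task latencies the abstract targets. It shows only the structure of the idea.

```python
import queue
import threading


def run_tasks(tasks, num_workers=4):
    """Run independent zero-argument callables on worker threads.

    Hypothetical helper for illustration: returns results in submission order.
    """
    task_queue = queue.SimpleQueue()
    results = [None] * len(tasks)

    # Enqueue every task together with the slot where its result belongs.
    for index, task in enumerate(tasks):
        task_queue.put((index, task))

    # One sentinel per worker tells that worker to shut down.
    for _ in range(num_workers):
        task_queue.put(None)

    def worker():
        while True:
            item = task_queue.get()
            if item is None:
                return
            index, task = item
            # Each worker writes to a distinct slot, so no lock is needed here.
            results[index] = task()

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results


# Eight independent fine-grained tasks, each squaring its own index.
squares = run_tasks([lambda i=i: i * i for i in range(8)])
```

In a sketch like this, the shared queue is the main point of contention between workers, which is why the choice of queue data structure matters so much at high task rates.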