A Plugin Architecture for the TAU Performance System

ICPP '19: Proceedings of the 48th International Conference on Parallel Processing(2019)

引用 7|浏览72
暂无评分
摘要
Several robust performance systems have been created for parallel machines with the ability to observe diverse aspects of application execution on different hardware platforms. All of these are designed with the objective to support measurement methods that are efficient, portable, and scalable. For these reasons, the performance measurement infrastructure is tightly embedded with the application code and runtime execution environment. As parallel software and systems evolve, especially towards more heterogeneous, asynchronous, and dynamic operation, it is expected that the requirements for performance observation and awareness will change. For instance, heterogeneous machines introduce new types of performance data to capture and performance behaviors to characterize. Furthermore, there is a growing interest in interacting with the performance infrastructure for in situ analytics and policy-based control. The problem is that an existing performance system architecture could be constrained in its ability to evolve to meet these new requirements. The paper reports our research efforts to address this concern in the context of the TAU Performance System. In particular, we consider the use of a powerful plugin model to both capture existing capabilities in TAU and to extend its functionality in ways it was not necessarily conceived originally. The TAU plugin architecture supports three types of plugin paradigms: EVENT, TRIGGER, and AGENT. We demonstrate how each operates under several different scenarios. Results from larger-scale experiments are shown to highlight the fact that efficiency and robustness can be maintained, while new flexibility and programmability can be offered that leverages the power of the core TAU system while allowing significant and compelling extensions to be realized.
更多
查看译文
关键词
measurement, parallel performance, plugin, runtime analytics
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要