Lightweight monitoring of the progress of remotely executing computations

LANGUAGES AND COMPILERS FOR PARALLEL COMPUTING(2006)

Abstract
The increased popularity of grid systems and cycle sharing across organizations requires scalable systems that provide facilities to locate resources, to be fair in the use of those resources, and to monitor jobs executing on remote systems. This paper presents a novel and lightweight approach to monitoring the progress and correctness of a parallel computation on a remote, and potentially fraudulent, host system. We describe a monitoring system that uses a sequence of program counter values to monitor program progress, and compiler techniques that automatically generate the monitoring code. This approach improves on earlier work by eliminating the need to duplicate computation, which both simplifies monitoring and reduces its overhead. Our approach allows dynamic and accountable cycle sharing across the Internet. Experimental results show that the overhead of our system is negligible and that our monitoring approach is scalable.
Keywords
monitoring code, program counter value, monitoring system, lightweight monitoring, parallel computation, remote system, grid system, scalable system, host system, monitoring approach, lightweight approach, program counter, parallel computer