BulkCompactor: Optimized deterministic execution via Conflict-Aware commit of atomic blocks

High Performance Computer Architecture (2012)

Abstract
Recent proposals for determinism-enforcement architectures are able to honor the dependences between threads through a commit step that often becomes a performance bottleneck. As they commit code blocks (or chunks) in a round-robin order, if one chunk gets squashed due to a conflict, its successors also observe a stall. We call this effect transitive squash delay. This paper proposes a novel, high-performance approach to deterministic execution based on Conflict-Aware commit. Rather than committing chunks in strict round-robin order, the idea is to skip those chunks with conflicts and deterministically execute them slightly later. The scheme, called BulkCompactor, largely eliminates transitive squash delay, “compacts” the chunk commits, and substantially speeds up execution. With BulkCompactor, the squash overhead is O(N) rather than O(N^2) as in round-robin. We describe BulkCompactor designs for machines with centralized or distributed commit. Finally, a simulation-based evaluation shows that BulkCompactor delivers performance comparable to nondeterministic systems. For example, for 32 processors, BulkCompactor incurs an average execution overhead of 22% over a nondeterministic system. The round-robin scheme's average overhead is 133%.
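The skip-and-defer idea in the abstract can be illustrated with a small software toy model. This is only a sketch under stated assumptions, not the paper's hardware design: the `Chunk` type, the `conflicts` test (a chunk conflicts if it read or wrote an address written by a chunk already committed in the same pass), and the multi-pass loop in `conflict_aware_order` are all hypothetical names and simplifications introduced here for illustration.

```python
from typing import List, NamedTuple, Set


class Chunk(NamedTuple):
    """Hypothetical model of an atomic block (not the paper's data structure)."""
    cid: int          # position in the deterministic round-robin order
    reads: Set[int]   # addresses read by the chunk
    writes: Set[int]  # addresses written by the chunk


def conflicts(committed_writes: Set[int], chunk: Chunk) -> bool:
    """Assumed conflict rule: the chunk touched an address that a chunk
    committed earlier in the same pass has written."""
    return bool(committed_writes & (chunk.reads | chunk.writes))


def conflict_aware_order(chunks: List[Chunk]) -> List[int]:
    """Return a commit order that skips conflicting chunks and retries them
    in a later pass. The order depends only on the input order and the
    address sets, so the same input always yields the same result."""
    order: List[int] = []
    pending = list(chunks)
    while pending:
        written: Set[int] = set()
        deferred: List[Chunk] = []
        for chunk in pending:
            if conflicts(written, chunk):
                # Skip instead of stalling the successors behind this chunk.
                deferred.append(chunk)
            else:
                order.append(chunk.cid)
                written |= chunk.writes
        # The first pending chunk always commits, so each pass makes progress.
        pending = deferred
    return order


example = [
    Chunk(0, reads=set(), writes={1}),
    Chunk(1, reads={1}, writes=set()),   # conflicts with chunk 0's write
    Chunk(2, reads=set(), writes={2}),
    Chunk(3, reads={3}, writes=set()),
]
print(conflict_aware_order(example))  # [0, 2, 3, 1]
```

In this toy run, chunk 1 is skipped because it read an address that chunk 0 wrote, while chunks 2 and 3 commit without waiting behind it; chunk 1 then commits in the next pass. A strict round-robin model would instead hold chunks 2 and 3 until chunk 1 had re-executed, which is the transitive squash delay the abstract describes.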
Keywords
bulkcompactor, high-performance approach, deterministic execution optimization, squash overhead, effect transitive squash delay, conflict-aware commit step, atomic blocks, nondeterministic systems, optimized deterministic execution, simulation-based evaluation, speeds-up execution, transitive squash delay, atomic block, bulkcompactor design, determinism-enforcement architectures, strict round-robin order, round-robin scheme, shared memory systems, round-robin order, average execution overhead, code blocks, average overhead, hardware, merging, instruction sets, schedules