Adaptive Cache Coherence Mechanisms with Producer–Consumer Sharing Optimization for Chip Multiprocessors

IEEE Transactions on Computers (2015)

Abstract
In chip multiprocessors (CMPs), maintaining cache coherence can be a major source of performance overhead. The write-invalidate protocols adopted by most CMPs generate a large number of cache-to-cache misses under producer-consumer sharing patterns. Accordingly, this paper presents three cache coherence mechanisms optimized for CMPs. First, to reduce the coherence misses observed in write-invalidate-based protocols, we propose a dynamic write-update mechanism layered on top of a write-invalidate protocol. This mechanism is triggered specifically when a producer-consumer sharing pattern is detected. Second, we extend this adaptive protocol with a bandwidth-adaptive mechanism that eliminates the performance degradation caused by write-updates under limited bandwidth. Finally, we propose a proximity-aware mechanism that extends the base adaptive protocol with latency-based optimizations. Experimental analysis is conducted on a set of scientific applications from the SPLASH-2 and NAS parallel benchmark suites. The proposed mechanisms are shown to reduce coherence misses by up to 48% and, in turn, to speed up applications by up to 30%. The bandwidth-adaptive mechanism is shown to perform well under varying levels of available bandwidth. The proximity-aware extension demonstrates up to 6% performance gain over the base adaptive protocol in 64-core tiled CMP runs. In addition, an analytical model provides good estimates of the performance gains from the adaptive protocols.
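The abstract does not spell out how a producer-consumer pattern is detected, so the following is only a minimal sketch of the general idea: a per-line directory entry that switches a write-invalidate protocol into write-update mode once the same core has repeatedly written a line that other cores read in between. The class name `DirectoryEntry`, its fields, and the threshold `PC_THRESHOLD` are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import Optional, Set

# Hypothetical threshold (an assumption, not from the paper): how many
# write -> remote-read rounds must be seen before switching to update mode.
PC_THRESHOLD = 2


@dataclass
class DirectoryEntry:
    """Illustrative per-cache-line directory state (all names are assumptions)."""
    last_writer: Optional[int] = None                 # core that wrote most recently
    consumers: Set[int] = field(default_factory=set)  # other cores that read since then
    pc_count: int = 0                                 # producer-consumer rounds observed
    update_mode: bool = False                         # True: push updates to sharers

    def on_read(self, core: int) -> None:
        # A read by a core other than the last writer marks it as a consumer.
        if self.last_writer is not None and core != self.last_writer:
            self.consumers.add(core)

    def on_write(self, core: int) -> str:
        """Return the coherence action for this write: 'update' or 'invalidate'."""
        if core == self.last_writer and self.consumers:
            # The same producer writes again after others read: one more round.
            self.pc_count += 1
        elif core != self.last_writer:
            # A different writer breaks the pattern: fall back to invalidation.
            self.pc_count = 0
            self.update_mode = False
        if self.pc_count >= PC_THRESHOLD:
            self.update_mode = True
        self.last_writer = core
        if not self.update_mode:
            # Invalidation removes the sharers' copies, so they must re-read.
            self.consumers.clear()
            return "invalidate"
        return "update"
```

With this sketch, a core that alternates writes with reads from another core receives "invalidate" for its first writes and "update" once the threshold is reached; a write from a different core resets the entry to invalidation. The bandwidth-adaptive and proximity-aware extensions described above are not modeled here.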
Keywords
cache storage, multiprocessing systems, parallel architectures, performance evaluation, 64-core tiled CMP, NAS parallel benchmark suites, SPLASH-2 suites, adaptive cache coherence mechanisms, bandwidth-adaptive mechanism, base adaptive protocol, cache-to-cache misses, chip multiprocessors, dynamic write-update mechanism, latency-based optimizations, performance degradation, producer-consumer sharing optimization, producer-consumer sharing pattern, proximity-aware extension, write-invalidate-based protocols, cache coherence, adaptable architectures, chip multiprocessors (CMPs), producer/consumer, optimization, bandwidth, producer consumer, radiation detectors, coherence, multicore processing, protocols