Energy-efficient Runtime Adaptive Scrubbing in fault-tolerant Network-on-Chips (NoCs) architectures

Computer Design (2013)

Abstract
As Networks-on-Chips (NoCs) become more susceptible to process variation, cross-talk, and hard and soft errors with technology scaling to sub-nanometer, there is an urgent need for adaptive Error Correction Coding (ECC) schemes that improve the resiliency of the system. The goal of adaptive ECC schemes should be twofold: to decrease power consumption when errors are infrequent, thereby maximizing power savings, and to increase fault coverage when errors are frequent, thereby improving application speedup at the cost of higher power consumption. In this paper, we propose Runtime Adaptive Scrubbing (RAS), a novel multi-layered error correction and detection scheme for NoC architectures that intelligently adjusts fault coverage at the physical layer using variable-strength encoders to scrub (protect) flits, thereby preventing faults from accumulating and propagating up to the logical layer. RAS permits graceful network degradation while improving overall network speedup and providing better fault granularity and wider fault coverage than traditional static schemes. Simulation results indicate that RAS improves network latency by an average of 10% for Splash-2/PARSEC benchmarks on an 8 × 8 mesh network while incurring a 6.6% power penalty per flit and saving 15% in area overhead.
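The abstract does not spell out the adaptation policy, so the following is only a minimal sketch of the general idea: a per-link controller that raises ECC strength when errors are frequent and lowers it when they are rare. The level names, the sliding-window size, and both thresholds are hypothetical illustrations, not values or mechanisms taken from the paper.

```python
from collections import deque

# Hypothetical protection levels, ordered from weakest (cheapest) to
# strongest (most power-hungry). These names are assumptions for illustration.
LEVELS = ("detect_only", "single_correct", "double_correct")


class AdaptiveScrubController:
    """Picks an ECC strength for one link from recently observed flit errors."""

    def __init__(self, window=100, raise_at=0.05, lower_at=0.01):
        # window, raise_at and lower_at are illustrative tuning knobs.
        self.history = deque(maxlen=window)  # 1 = faulty flit, 0 = clean flit
        self.raise_at = raise_at
        self.lower_at = lower_at
        self.level = 0  # start at the weakest level to save power

    def observe(self, flit_had_error):
        """Record one flit outcome and return the level to use next."""
        self.history.append(1 if flit_had_error else 0)
        # Re-evaluate only once a full window of observations is available.
        if len(self.history) < self.history.maxlen:
            return LEVELS[self.level]
        rate = sum(self.history) / len(self.history)
        if rate >= self.raise_at and self.level < len(LEVELS) - 1:
            self.level += 1       # errors frequent: widen fault coverage
            self.history.clear()  # measure afresh at the new level
        elif rate <= self.lower_at and self.level > 0:
            self.level -= 1       # errors rare: fall back to save power
            self.history.clear()
        return LEVELS[self.level]
```

Clearing the window after each change is one simple way to keep the controller from oscillating on stale observations; a hardware implementation would more plausibly use saturating counters than a software queue.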
Keywords
adaptive codes, energy conservation, error correction codes, error statistics, fault tolerance, network-on-chip, power consumption, NoC architectures, RAS, Splash-2/PARSEC benchmarks, adaptive ECC schemes, adaptive error correction coding schemes, application speedup, cross-talk, energy-efficient runtime adaptive scrubbing, fault coverage, fault granularity, fault-tolerant network-on-chips architecture, graceful network degradation, hard error, logical layer, multilayered error correction scheme, multilayered error detection scheme, network latency, overall network speedup, physical layer, power penalty, power savings, process variation, scrub flits, soft error, static schemes, subnanometer, system resiliency, technology scaling, variable strength encoders, ECC