On-Demand Redundancy Grouping: Selectable Soft-Error Tolerance for a Multicore Cluster

2022 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2022

Abstract
As technology nodes shrink and parallel processor clusters are deployed in hostile and critical environments such as space, run-time faults caused by radiation have become a serious cross-cutting concern that also affects architectural design. This paper introduces an architectural approach to run-time configurable soft-error tolerance at the core level, augmenting a six-core open-source RISC-V cluster with a novel On-Demand Redundancy Grouping (ODRG) scheme. ODRG allows the cluster to operate either as two fault-tolerant cores or as six individual cores for high performance, with limited overhead when switching between these modes at run time. The ODRG unit adds less than 11% of a core's area for a three-core group, or a total of 1% of the cluster area, and shows a negligible timing increase. This compares favorably to a commercial state-of-the-art implementation, and ODRG is $2.5\times$ faster in fault-recovery re-synchronization. Furthermore, when redundancy is not necessary, the ODRG approach allows the redundant cores to be used for independent computation, yielding up to a $2.96\times$ increase in performance for selected applications.
Keywords
Reliability, Adaptive Fault Tolerance, RISC-V, Space Vehicle Computers