Evaluating an XOR-based Hybrid Fault Tolerance Technique to Detect Faults in GPU Pipelines
2023 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2023
Abstract
Graphics Processing Units (GPUs) are steadily reaching new application domains thanks to their massively parallel execution architectures. However, some safety-critical domains, such as avionics, operate in hostile environments where radiation from cosmic rays can cause component failures. This work implements and tests a hybrid fault tolerance technique initially proposed by NVIDIA to protect a GPU’s pipeline against radiation effects. Results show that the technique can be effective against data-flow errors, but at the cost of high execution-time overhead and a potential increase in control-flow errors.
Keywords
Fault tolerance, graphics processing units, pipeline, single event upsets