Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems
IEEE International Conference on High Performance Computing, Data, and Analytics (2011)
Keywords
checkpointing, fault tolerance, parallel machines, checkpointing strategy, component failure dynamics, fault-tolerant algorithm, hardware failure, heterogeneous failure, large parallel system, supercomputing application