Towards Efficient I/O Scheduling for Collaborative Multi-Level Checkpointing

2021 29th International Symposium on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems (MASCOTS) (2021)

Abstract
Efficient, periodic checkpointing of distributed data structures at key moments during runtime is a recurring fundamental pattern in a large number of use cases: fault tolerance based on checkpoint-restart, in-situ or post-analytics, reproducibility, adjoint computations, etc. In this context, multilevel checkpointing is a popular technique: distributed processes can write their shard of the d...
Keywords
Checkpointing, Analytical models, Runtime, Processor scheduling, Computational modeling, Graphics processing units, Collaboration