Recovery Algorithms for Paxos-Based State Machine Replication

Periodicals(2021)

Abstract
In this article, we propose and evaluate three state recovery algorithms designed for Paxos, one of the most popular distributed agreement protocols. Paxos is commonly used to maintain consistency among state machine replicas despite process failures. The first algorithm, which we call FullSS, originates from the original Paxos and requires the system to use stable storage frequently during regular (non-faulty) execution. The other two state recovery algorithms, ViewSS and EpochSS, rarely require access to stable storage, and the recovering process must do much less work to restore its lost state and to catch up with the current state of the system. We thoroughly analyze and compare the behavior of the three algorithms during state recovery and during regular, non-faulty system execution, under various workloads (e.g., workloads that saturate the network or the CPU). The experimental results show that ViewSS and EpochSS significantly improve process recovery with respect to the original Paxos, provided that a majority of replicas can be assumed to be up and running at any time (excluding replicas that are still recovering). Moreover, these algorithms do not impact the performance of Paxos during regular (non-faulty) operation. However, FullSS is the only choice of the three if the system must tolerate catastrophic failures.
Keywords
Computer crashes, Protocols, Fault tolerance, Fault tolerant systems, System performance, Writing, Distributed algorithms, Paxos, state machine replication, fault-tolerance