Modulo: Finding Convergence Failure Bugs in Distributed Systems with Divergence Resync Models

USENIX Annual Technical Conference (USENIX ATC)(2022)

引用 2|浏览19
暂无评分
摘要
While there exist many consistency models for distributed systems, most of those models seek to provide the basic guarantee of convergence: given enough time and no further inputs, all replicas in the system should eventually converge to the same state. However, because of Convergence Failure Bugs (CFBs), many distributed systems do not provide even this basic guarantee. The violation of the convergence property can be crucial to safety-critical applications collectively working together with a shared distributed system. Indeed, many CFBs are reported as major issues by developers. Our key insight is that CFBs are caused by divergence, or differences between the state of replicas, and that a focused exploration of divergence states can reveal bugs in the convergence logic of real distributed systems while avoiding state explosion. Based on this insight, we have designed and implemented Modulo, the first Model-Based Testing tool using Divergence Resync Models (DRMs) to systematically explore divergence and convergence in real distributed systems. Modulo uses DRMs to explore an abstract state machine of the system and derive schedules, the intermediate representation of test cases, which are then translated into test inputs and injected into systems under test (SUTs). We ran Modulo to check ZooKeeper, MongoDB, and Redis and found 11 bugs (including 6 previously unknown ones)
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要