
CloudRaid: Detecting Distributed Concurrency Bugs via Log Mining and Enhancement

IEEE Transactions on Software Engineering (2022)

Abstract
Cloud systems suffer from distributed concurrency bugs, which often lead to data loss and service outages. This paper presents CloudRaid, a new automatic tool for finding distributed concurrency bugs efficiently and effectively. Distributed concurrency bugs are notoriously difficult to find because they are triggered by untimely interactions among nodes, i.e., unexpected message orderings. To detect concurrency bugs in cloud systems efficiently and effectively, CloudRaid automatically analyzes and tests only the message orderings that are likely to expose errors. Specifically, CloudRaid mines the logs from previous executions to uncover the message orderings that are feasible but inadequately tested. In addition, we propose a log enhancing technique that automatically introduces new logs into the system being tested. These extra logs further improve the effectiveness of CloudRaid without introducing any noticeable performance overhead. Because the approach is log-based, it is well suited to live systems. We have applied CloudRaid to analyze six representative distributed systems: Hadoop2/Yarn, HBase, HDFS, Cassandra, Zookeeper, and Flink. CloudRaid has succeeded in testing 60 different versions of these six systems (10 versions per system) in 35 hours, uncovering 31 concurrency bugs, including nine new bugs that have never been reported before. Of these nine new bugs, all of which have been confirmed by the original developers, three are critical and have already been fixed.
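To make the log-mining idea concrete, the following is a minimal sketch and not CloudRaid's actual algorithm. It assumes each previous execution has already been reduced to a list of message identifiers in logged order, and it flags every pair of messages that has only ever been observed in one order, so the reverse order is a candidate for targeted testing. The function name `untested_orderings` and the message names are hypothetical. The sketch does not check feasibility: deciding whether a flipped order can actually occur requires causality information that is omitted here.

```python
from itertools import combinations


def untested_orderings(runs):
    """Return ordered pairs (x, y) never observed in any run,
    although the reverse order (y, x) was observed at least once.

    runs: list of executions, each a list of message ids in logged order.
    """
    seen = set()  # (a, b) means a was logged before b in some run
    for run in runs:
        for i, j in combinations(range(len(run)), 2):
            if run[i] != run[j]:
                seen.add((run[i], run[j]))
    # A pair seen in one direction only has an untested reverse ordering.
    return sorted((b, a) for (a, b) in seen if (b, a) not in seen)


# Hypothetical message names, for illustration only.
runs = [
    ["register", "allocate", "finish"],
    ["register", "finish", "allocate"],
]
print(untested_orderings(runs))
```

In this toy input, "allocate" and "finish" appear in both orders, so only the orderings that place "register" later remain as candidates.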
Keywords
Distributed systems, concurrency bugs, bug detection, cloud computing