CMDiagnostor: An Ambiguity-Aware Root Cause Localization Approach Based on Call Metric Data

WWW 2023(2023)

引用 2|浏览57
暂无评分
摘要
The availability of online services is vital as its strong relevance to revenue and user experience. To ensure online services’ availability, quickly localizing the root causes of system failures is crucial. Given the high resource consumption of traces, call metric data are widely used by existing approaches to construct call graphs in practice. However, ambiguous correspondences between upstream and downstream calls may exist and result in exploring unexpected edges in the constructed call graph. Conducting root cause localization on this graph may lead to misjudgments of real root causes. To the best of our knowledge, we are the first to investigate such ambiguity, which is overlooked in the existing literature. Inspired by the law of large numbers and the Markov properties of network traffic, we propose a regression-based method (named AmSitor) to address this problem effectively. Based on AmSitor, we propose an ambiguity-aware root cause localization approach based on Call Metric Data named CMDiagnostor, containing metric anomaly detection, ambiguity-free call graph construction, root cause exploration, and candidate root cause ranking modules. The comprehensive experimental evaluations conducted on real-world datasets show that our CMDiagnostor can outperform the state-of-the-art approaches by 14% on the top-5 hit rate. Moreover, AmSitor can also be applied to existing baseline approaches separately to improve their performances one step further. The source code is released at https://github.com/NetManAIOps/CMDiagnostor.
更多
查看译文
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要