LPV: A Log Parsing Framework Based on Vectorization

IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT (2023)

Abstract
Logs are pervasive in modern computing systems and are valuable for service and system management. As computing systems grow rapidly in size and complexity, log volume is exploding, which makes automatic log analysis imperative. The first and fundamental step in automatic log analysis is log parsing, to which considerable effort has been devoted. However, most existing log parsing methods treat log messages merely as plain text. In natural language processing (NLP), it is common practice to represent words and sentences as vectors, so that the similarity between two words or sentences can be measured by the distance between their vectors. Inspired by this, we put forward a novel log parsing framework named LPV (Log Parser based on Vectorization), which performs log parsing by converting log messages and log templates into vectors with the help of an NLP vectorization method. LPV incorporates both offline and online log parsing. In offline log parsing, the central idea is to first represent log messages as vectors, so that the similarity between two log messages can be measured by the distance between their vectors; we then cluster the log messages by clustering their vectors, and finally extract log templates from the resulting clusters. At the end of offline log parsing, each log template is assigned an average vector, so that in online log parsing the similarity between an incoming log message and each log template can likewise be measured by the distance between their vectors. Extensive experiments on several public log datasets evaluate LPV with three different vectorization methods. The results demonstrate that, with a proper vectorization method, LPV is competitive with state-of-the-art log parsing methods in both effectiveness and efficiency.
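
The offline and online stages described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the vectorization methods or the clustering algorithm, so this sketch assumes a hashed bag-of-tokens vector in place of an NLP embedding, a greedy threshold-based clustering, and cosine distance. The names `vectorize`, `offline_parse`, `online_parse`, `DIM`, and the `threshold` value are all hypothetical.

```python
import math
import zlib

DIM = 1024  # hypothetical vector size for the hashed bag-of-tokens stand-in


def vectorize(message):
    """Map a log message to a unit-length hashed bag-of-tokens vector.

    Assumption: this stands in for whichever NLP vectorization method is used.
    """
    vec = [0.0] * DIM
    for token in message.split():
        vec[zlib.crc32(token.encode("utf-8")) % DIM] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0
    return 1.0 - dot / (na * nb)


def mean_vector(vectors):
    """Element-wise average of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def extract_template(messages):
    """Keep tokens identical at a position in every message; mask the rest."""
    token_lists = [m.split() for m in messages]
    template = []
    for i, token in enumerate(token_lists[0]):
        same = all(len(t) > i and t[i] == token for t in token_lists)
        template.append(token if same else "<*>")
    return " ".join(template)


def offline_parse(messages, threshold=0.3):
    """Cluster message vectors greedily, then return (template, average vector) pairs."""
    clusters = []
    for msg in messages:
        vec = vectorize(msg)
        best = min(clusters,
                   key=lambda c: cosine_distance(vec, c["centroid"]),
                   default=None)
        if best is not None and cosine_distance(vec, best["centroid"]) <= threshold:
            best["messages"].append(msg)
            best["vectors"].append(vec)
            best["centroid"] = mean_vector(best["vectors"])
        else:
            clusters.append({"messages": [msg], "vectors": [vec], "centroid": vec})
    return [(extract_template(c["messages"]), c["centroid"]) for c in clusters]


def online_parse(message, templates, threshold=0.3):
    """Return the nearest template for an incoming message, or None if none is close."""
    if not templates:
        return None
    vec = vectorize(message)
    template, centroid = min(templates, key=lambda t: cosine_distance(vec, t[1]))
    return template if cosine_distance(vec, centroid) <= threshold else None
```

Calling `offline_parse` on a batch of historical messages yields the templates and their average vectors; `online_parse` then compares each incoming message against those average vectors only, so the online cost depends on the number of templates rather than the number of past messages.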
Keywords
Log parsing, log template extraction, log analysis, vectorization, service and system management