LogRep: Log-based Anomaly Detection by Representing both Semantic and Numeric Information in Raw Messages

Xiaoda Xie,Songlei Jian,Chenlin Huang,Fengyuan Yu, Yunjia Deng

2023 IEEE 34th International Symposium on Software Reliability Engineering (ISSRE)(2023)

引用 0|浏览4
暂无评分
摘要
Log-based anomaly detection plays an essential role in various system reliability-related fields including software reliability, network reliability, and so on. System log data is a kind of semi-structured heterogeneous data that contains both semantic parts and numeric variables which both reflect the abnormal behavior of the system. However, existing log-based anomaly detection methods fail to capture the numeric information in raw data which makes them degrade a lot when only limited labeled data is available. To comprehensively capture the semantic and numeric information to enhance anomaly detection, we propose LogRep, a novel representation-based log anomaly detection method that captures both semantic and numeric information in the learned representations. The newly proposed position-aware numeric representation learning module and the attention-based representation fusion module in LogRep solve the heterogeneity problem well in log data. Due to the high quality of learned log representation, LogRep can achieve a comparable anomaly detection performance with SOTA methods while the training data used in LogRep is two orders of magnitude less than that used in SOTA methods. When reducing the training data scale, the performance of SOTA methods drops a lot, while LogRep keeps a stable good performance on two public HDFS dataset, BGL dataset, and one self-collected dataset. Specifically, LogRep achieves the 10.6% and 5.8% improvements over the second-best method in terms of F1 score on the BGL and HDFS datasets when only 1% training data are available respectively.
更多
查看译文
关键词
Log-based anomaly detection,Log representation learning,Limited training data,Log heterogeneity,Log data analysis
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要