Towards Detecting Anomalies in Log-Event Sequences with Deep Learning: Open Research Challenges

PROCEEDINGS OF THE 2023 EUROPEAN INTERDISCIPLINARY CYBERSECURITY CONFERENCE, EICC 2023 (2023)

Abstract
Anomaly Detection (AD) is an important technique for reliably detecting malicious behavior and attacks on computer systems. Log data is a rich source of information about systems and thus provides a suitable input for AD. With the sheer amount of log data available today, Machine Learning (ML) and its further development, Deep Learning (DL), have been applied for years to create models for AD. Especially when processing complex log data, DL is often able to achieve better performance than conventional ML. To detect anomalous patterns that span multiple log lines, it is necessary to group these log lines into log-event sequences. This work uses a Long Short-Term Memory (LSTM) model for AD, one of the most important approaches for representing long-range temporal dependencies in log-event sequences of arbitrary length. This means that we use past information to predict whether future events are normal or anomalous. For the LSTM model, we adapt a state-of-the-art open-source implementation called LogDeep. For the evaluation, we use a Hadoop Distributed File System (HDFS) data set, which is well studied in current research, and an open-source Audit data set provided by the Austrian Institute of Technology (AIT). Padding is a common preprocessing step that strongly influences the AD process and artificially improves detection results, and thus accuracy, in lab testing. In this paper we show that without padding it is not possible to achieve the same high quality of results reported in the literature. Furthermore, we analyze limitations of DL approaches applied to AD and list future research priorities and design challenges.
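The role of padding in next-event prediction can be illustrated with a minimal Python sketch. It is not taken from the paper or from LogDeep: the function name `make_windows`, the padding ID `PAD = 0`, the choice to pad on the right, and the toy event IDs are all assumptions made for illustration. The sketch splits one log-event sequence into (history, next event) pairs with a fixed window size, once without and once with padding.

```python
from typing import List, Tuple

# Hypothetical reserved ID for padding; real event IDs are assumed to start at 1.
PAD = 0


def make_windows(events: List[int], window: int, pad: bool) -> List[Tuple[List[int], int]]:
    """Split one log-event sequence into (history, next_event) pairs.

    Each pair holds `window` consecutive events and the event that follows
    them. If `pad` is True, a sequence shorter than window + 1 is extended
    on the right with PAD so that it yields exactly one pair.
    """
    seq = list(events)
    if pad and len(seq) < window + 1:
        seq = seq + [PAD] * (window + 1 - len(seq))
    # A sequence shorter than window + 1 produces an empty range, so no pairs.
    return [(seq[i:i + window], seq[i + window]) for i in range(len(seq) - window)]


if __name__ == "__main__":
    long_seq = [1, 2, 3, 4]   # toy event IDs
    short_seq = [5]           # shorter than the window
    print(make_windows(long_seq, window=2, pad=False))
    print(make_windows(short_seq, window=2, pad=False))
    print(make_windows(short_seq, window=2, pad=True))
```

In this sketch, a short sequence contributes no samples without padding, while with padding it contributes a sample whose target is the padding symbol itself. Such samples are trivial to predict, which is one plausible way in which padding could inflate measured accuracy; the paper's own analysis should be consulted for the actual mechanism.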
Keywords
log-event sequences, anomaly detection, deep learning, LSTM