Improving Malicious PDF Detection with a Robust Stacking Ensemble Approach

Ahmed Haj Abdel Khaleq, Miguel A. Garzon

2023 20th Annual International Conference on Privacy, Security and Trust (PST) (2023)

Abstract

In recent years, the increasing prevalence of malicious PDF files has become a major cybersecurity concern. Despite advances in machine learning-based detection methods, attackers continue to develop novel evasion techniques, which calls for more robust and effective models. In this paper, we propose a stacking ensemble model that serves as part of a framework for robustness against feature-specific attacks. Additionally, we address the limitations of the PDFMal-2022 dataset by enhancing the feature extraction module and resolving issues related to flawed values and mismatched mapping, producing an improved dataset with additional features: the enhanced PDFMal-2022. We evaluate our model's performance on the widely used Contagio dataset and compare it to three state-of-the-art approaches. Moreover, we provide a benchmark performance of the proposed model on the enhanced PDFMal-2022 dataset, validating its effectiveness in a more challenging setting. Finally, we demonstrate our proposed framework's performance and robustness against malicious PDFs from the QAKBOT and ICEDID malware campaigns. Our proposed stacking ensemble model and the enhanced PDFMal-2022 dataset contribute to the field of malicious PDF detection, providing a valuable asset in combating PDF-based cyber threats.

Keywords

Stacking Ensembles, Malicious PDF Detection, Malicious PDF Detection Dataset
