A novel Android malware detection method with API semantics extraction

Computers & Security (2024)

Abstract
Because both the Android framework and malware evolve continuously, conventional malware detection methods trained on outdated apps fail to identify sophisticated, evolved malware effectively. To address this issue, we propose a novel Android malware detection method with API semantics extraction (AMDASE), which can effectively identify evolved malware instances. First, before malware detection, AMDASE performs API clustering to obtain cluster centers that represent API functions. We design an API sentence to summarize the features of each API and employ natural language processing (NLP) tools to obtain embeddings of the API sentences for clustering. The API sentence makes it possible to extract the API semantics carried by features such as the method name, which accurately reflects the API's intended functionality; this in turn makes the clustering results more accurate. Second, AMDASE extracts a call graph from each app and optimizes it by removing the nodes that correspond to unknown functions while preserving the connectivity between their predecessor and successor nodes. The optimized call graph yields more robust API contextual information that accurately represents the behavior of each app. Third, to remain resilient against the evolution of Android malware, AMDASE extracts function call pairs from the optimized call graph and abstracts the APIs in these pairs into the cluster centers obtained from API clustering. Finally, feature vectors are generated using one-hot mapping, and machine learning classifiers are used for malware detection. We evaluate AMDASE on a dataset of 42,154 benign and 42,450 malicious apps developed over a seven-year period. The experimental results demonstrate that AMDASE greatly outperforms existing state-of-the-art methods and ages significantly more slowly.
Keywords
Evolved malware detection, Model aging, API semantics, Function call graph, Machine learning
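
The second through fourth steps of the abstract (call graph optimization, call pair abstraction, and one-hot mapping) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `known` set that decides which functions count as "unknown", the cluster labels in `api_to_cluster`, and the pair vocabulary are all hypothetical, and the API clustering step is assumed to have been done already.

```python
def prune_unknown(graph, known):
    """Remove nodes not in `known`, linking each removed node's
    predecessors directly to its successors so connectivity is kept."""
    g = {node: set(succs) for node, succs in graph.items()}
    # Make sure every callee also appears as a key.
    for succs in list(g.values()):
        for s in succs:
            g.setdefault(s, set())
    for node in [n for n in g if n not in known]:
        succs = g.pop(node) - {node}          # drop self-loops of the removed node
        preds = [p for p, s in g.items() if node in s]
        for p in preds:
            g[p].discard(node)
            g[p] |= succs                     # bridge predecessor -> successors
    return g


def abstract_pairs(graph, api_to_cluster):
    """Turn every caller -> callee edge into a pair of cluster labels.
    Functions without a cluster label keep their own name (an assumption)."""
    return {
        (api_to_cluster.get(caller, caller), api_to_cluster.get(callee, callee))
        for caller, callees in graph.items()
        for callee in callees
    }


def one_hot(pairs, vocabulary):
    """1 if the app contains the call pair, 0 otherwise, in vocabulary order."""
    return [1 if pair in pairs else 0 for pair in vocabulary]


# Toy call graph: "x.a" and "x.b" stand in for unknown (e.g. obfuscated) functions.
call_graph = {
    "Main.onCreate": {"x.a"},
    "x.a": {"TelephonyManager.getDeviceId", "x.b"},
    "x.b": {"SmsManager.sendTextMessage"},
}
known = {
    "Main.onCreate",
    "TelephonyManager.getDeviceId",
    "SmsManager.sendTextMessage",
}
# Hypothetical cluster labels; in AMDASE these would come from API clustering.
api_to_cluster = {
    "Main.onCreate": "c_entry",
    "TelephonyManager.getDeviceId": "c_device_id",
    "SmsManager.sendTextMessage": "c_sms",
}
# Hypothetical, fixed ordering of call pairs observed across a training set.
vocabulary = [
    ("c_entry", "c_device_id"),
    ("c_entry", "c_network"),
    ("c_entry", "c_sms"),
]

pruned = prune_unknown(call_graph, known)
pairs = abstract_pairs(pruned, api_to_cluster)
vector = one_hot(pairs, vocabulary)
```

In this toy graph, removing `x.a` and `x.b` leaves `Main.onCreate` connected directly to both framework APIs, so the call context survives even though the intermediate functions are gone. Because the features are pairs of cluster labels rather than concrete API names, an API introduced in a later framework version would map to an existing feature as long as it falls into an existing cluster. The resulting `vector` could then be passed to any standard classifier.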