A novel Android malware detection method with API semantics extraction

COMPUTERS & SECURITY(2024)

Abstract
Because both the Android framework and malware evolve continuously, conventional malware detection methods trained on outdated apps are inadequate for identifying sophisticated evolved malware. To address this issue, we propose a novel Android malware detection method with API semantics extraction (AMDASE) that can effectively identify evolved malware instances. First, AMDASE performs API clustering before malware detection to obtain cluster centers that represent API functions. We design an API sentence to summarize API features and employ natural language processing (NLP) tools to obtain embeddings of API sentences for clustering. API sentences make it possible to extract the semantics carried by features such as the method name, which reflects an API's intended functionality, and this in turn makes the clustering results more accurate. Second, AMDASE extracts a call graph from each app and optimizes it by removing nodes that correspond to unknown functions while preserving the connectivity between their predecessor and successor nodes. The optimized call graph yields more robust API contextual information that accurately represents the behavior of each app. Third, to remain resilient against the evolution of Android malware, AMDASE extracts function call pairs from the optimized call graph and abstracts the APIs in those pairs into the cluster centers obtained during API clustering. Finally, feature vectors are generated using one-hot mapping, and machine learning classifiers are used for malware detection. We evaluate AMDASE on a dataset of 42,154 benign and 42,450 malicious apps developed over a seven-year period. The experimental results demonstrate that AMDASE greatly outperforms existing state-of-the-art methods and ages significantly more slowly.
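The graph optimization, call pair abstraction, and one-hot mapping steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes that the call graph is a plain dictionary from a function name to the set of functions it calls, that an "unknown function" is any node missing from a hypothetical `api_to_cluster` mapping, and that the function names, cluster labels (`C_identity`, `C_sms`), and pair vocabulary are invented for illustration.

```python
def remove_unknown_nodes(graph, is_known):
    """Drop unknown nodes and link each predecessor to each successor.

    graph: dict mapping a node to the set of nodes it calls (assumed format).
    is_known: predicate deciding whether a node is kept (assumption).
    """
    # Copy the graph and make sure every callee also appears as a key.
    succ = {node: set(callees) for node, callees in graph.items()}
    for callees in graph.values():
        for callee in callees:
            succ.setdefault(callee, set())

    for node in [n for n in succ if not is_known(n)]:
        preds = [p for p in succ if node in succ[p] and p != node]
        nexts = succ[node] - {node}
        for p in preds:
            # Bypass the removed node so connectivity is preserved.
            succ[p].discard(node)
            succ[p].update(nexts)
        del succ[node]
    return succ


def abstract_call_pairs(graph, api_to_cluster):
    """Turn (caller, callee) edges into (cluster, cluster) pairs."""
    return {
        (api_to_cluster[caller], api_to_cluster[callee])
        for caller, callees in graph.items()
        for callee in callees
    }


def one_hot(pairs, vocabulary):
    """Binary feature vector: 1 if the pair occurs in the app, else 0."""
    return [1 if pair in pairs else 0 for pair in vocabulary]


# Hypothetical app: two APIs connected only through obfuscated helpers.
call_graph = {
    "getDeviceId": {"obf_a"},
    "obf_a": {"obf_b"},
    "obf_b": {"sendTextMessage"},
    "sendTextMessage": set(),
}
# Hypothetical result of the API clustering step.
api_to_cluster = {"getDeviceId": "C_identity", "sendTextMessage": "C_sms"}

optimized = remove_unknown_nodes(call_graph, lambda n: n in api_to_cluster)
pairs = abstract_call_pairs(optimized, api_to_cluster)
vocabulary = [("C_identity", "C_sms"), ("C_sms", "C_identity")]
vector = one_hot(pairs, vocabulary)
```

In this toy example the two obfuscated helpers are removed, the remaining APIs stay connected by a direct edge, and the resulting vector could be passed to any standard classifier. Because the features refer to cluster labels rather than concrete API names, a new API that falls into an existing cluster would map to the same feature, which is the intuition behind the slower aging reported in the abstract.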
Key words
Evolved malware detection, Model aging, API semantics, Function call graph, Machine learning