AST-Based LSTM Neural Network for Predicting Input Validation Vulnerabilities

ADVANCES IN CYBERSECURITY, CYBERCRIMES, AND SMART EMERGING TECHNOLOGIES (2023)

Abstract
As Web-based applications have grown in popularity, input validation problems have become more common. Two prominent input validation vulnerabilities are SQL injection (SQLi) and Cross-Site Scripting (XSS). Vulnerability prediction methods based on machine learning have recently gained favor in Web security because of their simplicity and efficiency. These methods usually rely on sophisticated graphs derived from source code or on carefully crafted regex patterns. Tokenization breaks source code down into a sequence of tokens from which neural network methods can assess the vulnerability of a program's structure and flow. This paper proposes a model for predicting input validation vulnerabilities based on the Abstract Syntax Tree (AST) and the Long Short-Term Memory (LSTM) algorithm. Programs are translated into an AST, and nodes that are not linked to the program's vulnerabilities are then removed. The token sequence generated from the reduced AST is fed to the LSTM model, which learns the flow of vulnerabilities in the source code. The proposed model achieved an accuracy of 96.44%, higher than that of the related studies.
Keywords
Input validation vulnerability, XSS, SQLi, Minimal SSA, Deep learning, Tokenization
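
The preprocessing steps named in the abstract (parse a program into an AST, drop nodes unrelated to vulnerabilities, emit a token sequence for the LSTM) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses Python's standard `ast` module as a stand-in parser, and the set of pruned node types, the token naming scheme, and the helpers `ast_token_sequence`, `encode`, and `pad` are all hypothetical choices made for illustration. The LSTM itself is omitted; the padded integer sequence is simply the kind of input such a model would typically consume.

```python
import ast

# Hypothetical pruning rule: the abstract does not say which node types are
# removed, so this sketch drops context markers and wrapper nodes only.
PRUNED_NODE_TYPES = (ast.Load, ast.Store, ast.Del, ast.Expr, ast.Module)


def ast_token_sequence(source):
    """Parse source code and return a pre-order list of AST tokens."""
    tree = ast.parse(source)
    tokens = []

    def visit(node):
        if not isinstance(node, PRUNED_NODE_TYPES):
            if isinstance(node, ast.Name):
                # Keep identifiers so tainted variables stay traceable.
                tokens.append("Name:" + node.id)
            elif isinstance(node, ast.Attribute):
                # Keep attribute names so sink calls such as execute() are visible.
                tokens.append("Attribute:" + node.attr)
            else:
                tokens.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            visit(child)

    visit(tree)
    return tokens


def encode(tokens, vocab):
    """Map tokens to integer ids, growing vocab as needed (0 is padding)."""
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in tokens]


def pad(ids, length):
    """Truncate or right-pad an id sequence with zeros to a fixed length."""
    return (ids + [0] * length)[:length]


# Illustrative snippet that concatenates user input into a SQL query.
sample = (
    'query = "SELECT * FROM users WHERE id = " + user_id\n'
    "cursor.execute(query)\n"
)
vocab = {}
tokens = ast_token_sequence(sample)
ids = pad(encode(tokens, vocab), 16)
print(tokens)
print(ids)
```

In this sketch the repeated identifier `query` maps to the same integer id at its definition and at its use in the `execute` call, which is the kind of recurrence a sequence model could pick up on when learning how untrusted input reaches a sink.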