Improving Security Tasks Using Compiler Provenance Information Recovered At the Binary-Level

PROCEEDINGS OF THE 2023 ACM SIGSAC CONFERENCE ON COMPUTER AND COMMUNICATIONS SECURITY, CCS 2023(2023)

引用 0|浏览13
暂无评分
摘要
The complex optimizations supported by modern compilers allow for compiler provenance recovery at many levels. For instance, it is possible to identify the compiler family and optimization level used when building a binary, as well as the individual compiler passes applied to functions within the binary. Yet, many downstream applications of compiler provenance remain unexplored. To bridge that gap, we train and evaluate a multi-label compiler provenance model on data collected from over 27,000 programs built using LLVM 14, and apply the model to a number of security-related tasks. Our approach considers 68 distinct compiler passes and achieves an average F-1 score of 84.4%. We first use the model to examine the magnitude of compiler-induced vulnerabilities, identifying 53 information leak bugs in 10 popular projects. We also show that several compiler optimization passes introduce a substantial amount of functional code reuse gadgets that negatively impact security. Beyond vulnerability detection, we evaluate other security applications, including using recovered provenance information to verify the correctness of Rich header data in Windows binaries (e.g., forensic analysis), as well as for binary decomposition tasks (e.g., third party library detection).
更多
查看译文
关键词
Compiler-introduced security bugs,correctness-security gap
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要