MIPAD: Mini Program Analysis for Clone Detection using Static Analysis Techniques

Zhaohui Zhou, Ziqiang Yan, Yin Wang, Junfeng Liu, Jifei Shi, Ming Fan

2023 International Conference on Frontiers of Robotics and Software Engineering (FRSE), 2023

Abstract
In recent years, mini programs (applications hosted on third-party platforms, such as health QR codes, transport codes, and utilities) have been gradually replacing traditional mobile applications because they require no installation or uninstallation and can be used on demand. However, the rapid growth of mini programs has raised concerns about protecting the copyright of their code. There is currently little research on clone detection for mini programs, and their language features make plagiarism hard to detect: behaviour can only be observed incompletely, and similarity is difficult to calculate. To address this gap, we propose MIPAD, a detection method based on static feature analysis. It uses statistical features (SF) for clustering analysis, and layout features (LF) and code features (CFF, FDF, TLDF) for similarity detection. To make the LF, CFF, FDF, and TLDF features more robust, we apply a fuzzy hash algorithm during the feature extraction phase. To speed up the dependency graph similarity computation, we propose a fast anchor-based similarity computation algorithm. To address the lack of large, publicly available datasets in this domain, we designed a mini program crawling method that performs fuzzy crawling from a seed list and expands the list in real time, and we used it to collect on the order of 100,000 mini program samples. Using these samples, we evaluated MIPAD with a Random Forest classifier and X-means clustering. MIPAD achieved an accuracy of 90.5% with an average time overhead of 15.83 s per sample, demonstrating that it can detect cloned mini programs quickly and effectively.
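The abstract does not say which fuzzy hash algorithm MIPAD uses or how its features are encoded. As a rough illustration of the general idea only (small edits to the input should change the fingerprint only slightly), the sketch below uses a SimHash-style fingerprint over character shingles. The function names, the shingle length `k`, and the 64-bit width are all assumptions made for this example and are not taken from the paper.

```python
import hashlib


def shingles(text, k=4):
    """Return the set of overlapping k-character substrings of text."""
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def simhash(text, bits=64, k=4):
    """Compute a SimHash-style fingerprint (an illustrative stand-in, not MIPAD's hash)."""
    counts = [0] * bits
    for sh in shingles(text, k):
        digest = hashlib.md5(sh.encode("utf-8")).digest()
        h = int.from_bytes(digest[:bits // 8], "big")
        # Each shingle votes +1 or -1 on every bit position.
        for b in range(bits):
            counts[b] += 1 if (h >> b) & 1 else -1
    fingerprint = 0
    for b in range(bits):
        if counts[b] > 0:
            fingerprint |= 1 << b
    return fingerprint


def similarity(fp_a, fp_b, bits=64):
    """Fraction of fingerprint bits that agree; 1.0 means identical fingerprints."""
    return 1.0 - bin(fp_a ^ fp_b).count("1") / bits


# Hypothetical layout snippets: a page and a copy with one attribute renamed.
original = '<view class="card"><text>{{title}}</text><button bindtap="pay"/></view>'
renamed = '<view class="card"><text>{{title}}</text><button bindtap="buy"/></view>'
score = similarity(simhash(original), simhash(renamed))
```

Here `score` lies between 0 and 1, and a detector built this way would compare it against a threshold chosen on labelled data. Because unchanged shingles cast the same votes in both inputs, a light edit tends to leave most fingerprint bits intact, which is the kind of robustness the abstract attributes to fuzzy hashing.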
Keywords
clone detection, mini program, static analysis