Viewing functions as token sequences to highlight similarities in source code

Science of Computer Programming (2013)

Abstract
The detection of similarities in source code has applications not only in software re-engineering (to eliminate redundancies) but also in software plagiarism detection. The latter can be a challenging problem, since more or less extensive edits may have been made to the original copy: insertion or removal of useless chunks of code, rewriting of expressions, transposition of code, inlining and outlining of functions, etc. In this paper, we propose a new similarity detection technique based not only on token sequence matching but also on the factorization of the function call graphs. The factorization process merges shared chunks (factors) of code to cope, in particular, with inlining and outlining. The resulting call graph offers a view of the similarities together with their nesting relations, and is useful for inferring metrics that quantify similarity at the function level.
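To make the idea of viewing functions as token sequences concrete, the following is a minimal sketch in Python, not the technique of the paper itself. It assumes a toy regex tokenizer for a C-like language, abstracts identifiers and numbers to the placeholder tokens `ID` and `NUM` so that renaming does not hide a match, and uses `difflib.SequenceMatcher` from the standard library as a stand-in for the matching step. The names `tokenize`, `shared_factors`, `coverage`, and the `min_len` threshold are hypothetical, and call graph factorization is not modelled here.

```python
import re
from difflib import SequenceMatcher

# Assumption: a tiny keyword set for a C-like language, for illustration only.
KEYWORDS = {"if", "else", "while", "for", "return", "int"}
# Identifiers/keywords, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_0-9]")


def tokenize(source):
    """Map source text to a sequence of abstract tokens."""
    tokens = []
    for lexeme in TOKEN_RE.findall(source):
        if lexeme in KEYWORDS:
            tokens.append(lexeme)      # keywords are kept as-is
        elif lexeme[0].isalpha() or lexeme[0] == "_":
            tokens.append("ID")        # identifier names are abstracted away
        elif lexeme[0].isdigit():
            tokens.append("NUM")       # literal values are abstracted away
        else:
            tokens.append(lexeme)      # operators and punctuation
    return tokens


def shared_factors(a, b, min_len=4):
    """Return matching token runs of at least min_len tokens."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return [m for m in matcher.get_matching_blocks() if m.size >= min_len]


def coverage(a, b, min_len=4):
    """Fraction of the tokens of a covered by runs shared with b."""
    if not a:
        return 0.0
    return sum(m.size for m in shared_factors(a, b, min_len)) / len(a)


original = "int sum(int n) { int s = 0; for (i = 0; i < n; i++) s += i; return s; }"
renamed = "int total(int k) { int acc = 0; for (j = 0; j < k; j++) acc += j; return acc; }"
print(coverage(tokenize(original), tokenize(renamed)))
```

Because only identifier names differ between the two functions, their abstract token sequences coincide and the whole of the first function is covered. Edits such as inlining or outlining would split a match across several functions, which is the case the call graph factorization described in the abstract is meant to handle.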
Keywords
function call graph, software plagiarism detection, extensive edit, factorization process, source code, token sequence, challenging problem, software re-engineering, resulting call graph, function level, new similarity detection technique, viewing function, similarity, call graph, factorization, clones, inlining, duplication