Automatic Code Annotation Generation Based on Heterogeneous Graph Structure

2023 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), 2023

Abstract
Automatic code annotation generation aims to produce readable annotations that describe the functionality of source code, which can assist software developers. Previous methods follow an encoder-decoder structure in which the encoder relies on abstract syntax trees (ASTs) to capture the syntactic structure of code fragments. However, the AST alone cannot fully express the complicated control structures, data flows, or dependencies of source code, leading to sub-optimal annotations. Moreover, the same functionality can be implemented in various ways, with possibly different structures and token names, yet most methods treat code fragments independently and do not exploit the similarities among them. In this paper, we present HANCode2Seq, an automatic code annotation generation method built on a heterogeneous graph representation of code. Specifically, we construct the heterogeneous graph by combining multiple code-induced graphs, including abstract syntax trees, control flow graphs, data flow graphs, and program dependency graphs. A heterogeneous graph attention network is then applied to extract the semantics and syntactic structure of the source code fragments. Furthermore, we present a novel adaptive code similarity graph whose nodes are code fragments. The representation of a code fragment is enhanced by aggregating information from similar fragments on the graph, which may reduce the ambiguity of the code. Experimental results on real datasets show that our proposed model outperforms the baselines and produces more fluent and readable code annotations.
Key words
code annotation generation, heterogeneous graph, graph attention network
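
The abstract describes combining several code-induced graphs into one heterogeneous graph. As a rough illustration of that idea only, the sketch below builds a small typed-edge graph for a Python snippet using the standard `ast` module. It is not the HANCode2Seq construction: the function name `build_hetero_graph`, the edge type names (`ast_child`, `next_stmt`, `data_flow`), and the straight-line approximations of control and data flow are all assumptions made for this example, and program dependency edges are omitted.

```python
import ast
from collections import defaultdict

# Context and operator nodes (Load, Store, Add, ...) carry no position info and
# may be shared objects between parents, so this sketch leaves them out.
_SKIP = (ast.expr_context, ast.operator, ast.boolop, ast.unaryop, ast.cmpop)


def build_hetero_graph(source):
    """Return (nodes, edges); edges maps an edge type to (src, dst) id pairs.

    Hypothetical helper for illustration, not the paper's method.
    """
    tree = ast.parse(source)
    kept = [n for n in ast.walk(tree) if not isinstance(n, _SKIP)]
    node_id = {id(n): i for i, n in enumerate(kept)}
    nodes = [type(n).__name__ for n in kept]
    edges = defaultdict(list)

    for parent in kept:
        # Syntactic edges: AST parent -> child.
        for child in ast.iter_child_nodes(parent):
            if id(child) in node_id:
                edges["ast_child"].append((node_id[id(parent)], node_id[id(child)]))
        # Crude control flow (assumption): consecutive statements in a body.
        body = getattr(parent, "body", None)
        if isinstance(body, list):
            for prev, nxt in zip(body, body[1:]):
                edges["next_stmt"].append((node_id[id(prev)], node_id[id(nxt)]))

    # Crude data flow (assumption): link the most recent assignment of a name
    # to each later read, scanning in source order and ignoring branches.
    # On a single line, reads are processed before writes (e.g. x = x + 1).
    names = [n for n in kept if isinstance(n, ast.Name)]
    names.sort(key=lambda n: (n.lineno, isinstance(n.ctx, ast.Store), n.col_offset))
    last_def = {}
    for name in names:
        if isinstance(name.ctx, ast.Store):
            last_def[name.id] = node_id[id(name)]
        elif name.id in last_def:
            edges["data_flow"].append((last_def[name.id], node_id[id(name)]))
    return nodes, edges


nodes, edges = build_hetero_graph("a = 1\nb = a + 2\nc = a * b\n")
for edge_type, pairs in edges.items():
    print(edge_type, len(pairs))
```

A graph neural network with type-specific attention would then operate on `nodes` and the per-type edge lists; that part is outside the scope of this sketch.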