CodeT5Mix: A Pretrained Mixture of Encoder-decoder Transformers for Code Understanding and Generation

ICLR 2023

Abstract
Pretrained language models (LMs) trained on vast amounts of source code have made prominent progress on a wide range of code intelligence tasks. Despite their success, they either adopt specific network architectures (encoder-only or decoder-only) for different downstream tasks or rely on a single architecture (encoder-decoder or UniLM-style encoder) for all tasks; the latter approach usually yields sub-optimal performance on a subset of tasks. To address these limitations, we propose CodeT5Mix, a mixture of encoder-decoder Transformers for code whose components can be flexibly combined according to the target tasks during finetuning, while still enjoying the mutual benefits of joint pretraining. To endow the model with both code understanding and generation capabilities, we pretrain CodeT5Mix with a mixture of denoising, contrastive learning, matching, and causal language modeling (CLM) objectives on large-scale multilingual code corpora covering nine programming languages. Additionally, we design a weight-sharing strategy in which the decoders share all weights except the feedforward layers, which act as task-specific experts to reduce interference across tasks of different types. We extensively evaluate CodeT5Mix on seven tasks in four different modes and achieve state-of-the-art (SoTA) performance on most tasks, including text-to-code retrieval, code completion and generation, and math programming. In particular, we demonstrate that CodeT5Mix can be used as a unified semi-parametric retrieval-augmented generator with SoTA code generation performance.
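The abstract's decoder design can be pictured as decoder layers that share their attention weights across tasks while keeping a separate feedforward block per task. The sketch below is a minimal, hypothetical illustration of that idea in PyTorch; the class name, task labels, and dimensions are assumptions for illustration, not the paper's actual implementation.

```python
# Minimal sketch of decoder weight sharing with task-specific feedforward experts.
# Self-attention and cross-attention weights are shared across tasks; each task
# routes through its own FFN. Names and hyperparameters are illustrative only.
import torch
import torch.nn as nn


class SharedDecoderLayer(nn.Module):
    def __init__(self, d_model=768, n_heads=12, d_ff=3072, tasks=("gen", "match")):
        super().__init__()
        # Shared components: one copy, reused by every task.
        self.self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)
        self.norm3 = nn.LayerNorm(d_model)
        # Task-specific feedforward experts: separate weights per task.
        self.ffn = nn.ModuleDict({
            t: nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for t in tasks
        })

    def forward(self, x, enc_out, task, attn_mask=None):
        # Self-attention over the decoder inputs (causal mask passed by the caller).
        h, _ = self.self_attn(x, x, x, attn_mask=attn_mask)
        x = self.norm1(x + h)
        # Cross-attention into the encoder outputs.
        h, _ = self.cross_attn(x, enc_out, enc_out)
        x = self.norm2(x + h)
        # Route through the feedforward expert of the requested task.
        x = self.norm3(x + self.ffn[task](x))
        return x


# Usage: the same layer instance serves a generation-style and a matching-style
# pass, differing only in which FFN expert is used.
layer = SharedDecoderLayer()
dec_in = torch.randn(2, 16, 768)   # (batch, target length, hidden size)
enc_out = torch.randn(2, 32, 768)  # (batch, source length, hidden size)
out_gen = layer(dec_in, enc_out, task="gen")
out_match = layer(dec_in, enc_out, task="match")
```

Sharing the attention and normalization weights keeps the per-task parameter overhead small, while the separate feedforward experts limit interference between understanding-oriented and generation-oriented objectives.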
Keywords
Language model pretraining, multimodal learning, code understanding and generation