A General Theory for Compositional Generalization
CoRR (2024)
Abstract
Compositional Generalization (CG) embodies the ability to comprehend novel
combinations of familiar concepts, representing a significant cognitive leap in
human intellectual advancement. Despite its critical importance, the deep
neural network (DNN) faces challenges in addressing the compositional
generalization problem, prompting considerable research interest. However,
existing theories often rely on task-specific assumptions, constraining the
comprehensive understanding of CG. This study aims to explore compositional
generalization from a task-agnostic perspective, offering a complementary
viewpoint to task-specific analyses. The primary challenge is to define CG
without overly restricting its scope, which we achieve by identifying its
fundamental characteristics and building the definition on them. Using this
definition, we seek to answer the question "what does the ultimate solution to
CG look like?" through the following theoretical findings: 1) the first No Free
Lunch theorem in CG, indicating the absence of general solutions; 2) a novel
generalization bound applicable to any CG problem, specifying the conditions
for an effective CG solution; and 3) the introduction of the generative effect
to enhance understanding of CG problems and their solutions. This paper's
significance lies in providing a general theory for CG problems, which, when
combined with prior theorems under task-specific scenarios, can lead to a
comprehensive understanding of CG.