Data Augmentation using Large Language Models: Data Perspectives, Learning Paradigms and Challenges
arXiv (2024)
Abstract
In the rapidly evolving field of large language models (LLMs), data
augmentation (DA) has emerged as a pivotal technique for enhancing model
performance by diversifying training examples without the need for additional
data collection. This survey explores the transformative impact of LLMs on DA,
particularly addressing the unique challenges and opportunities they present in
the context of natural language processing (NLP) and beyond. From both data and
learning perspectives, we examine various strategies that utilize LLMs for data
augmentation, including a novel exploration of learning paradigms where
LLM-generated data is used for diverse forms of further training. Additionally,
this paper highlights the primary open challenges faced in this domain, ranging
from controllable data augmentation to multi-modal data augmentation. This
survey underscores the paradigm shift that LLMs have introduced in DA and aims
to serve as a comprehensive guide for researchers and practitioners.