Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers
arXiv (2023)
Abstract
The common modus operandi of fine-tuning large pre-trained Transformer models
entails the adaptation of all their parameters (i.e., full fine-tuning). While
achieving striking results on multiple tasks, this approach becomes infeasible
as the model size and the number of downstream tasks increase. In natural
language processing and computer vision, parameter-efficient approaches like
prompt-tuning and adapters have emerged as solid alternatives by fine-tuning
only a small number of extra parameters, without sacrificing performance.
Specifically, adapters, due to their flexibility, have recently garnered
significant attention, leading to several variants. For audio classification
tasks, the Audio Spectrogram Transformer model shows impressive results.
Surprisingly, however, how to efficiently adapt it to several downstream tasks
has not been tackled before. In this paper, we bridge this gap and present a
detailed investigation of common parameter-efficient methods, revealing that
adapters consistently outperform the other methods across four benchmarks.
This trend is also confirmed in few-shot learning settings and when the total
number of trainable parameters increases, demonstrating adapters' superior
scalability. Finally, we study the best adapter configuration, as well as the
role of residual connections in the learning process. Our code is available
at: https://github.com/umbertocappellazzo/PETL_AST.
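For context, the adapter technique at the center of the abstract can be sketched in a few lines of PyTorch. This is a minimal illustration assuming the standard bottleneck design (down-projection, non-linearity, up-projection) with a residual connection, whose role the paper studies; the class name Adapter, the bottleneck width, and the near-identity initialization are assumptions made for the sketch, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project,
    plus a residual connection. Widths are illustrative."""

    def __init__(self, dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck_dim)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck_dim, dim)
        # Near-identity initialization: the adapter initially passes
        # activations through unchanged, so fine-tuning starts from the
        # frozen pre-trained behavior (an assumption, common in practice).
        nn.init.zeros_(self.up.weight)
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

# Parameter-efficient training: freeze the pre-trained backbone and
# update only the (small number of) adapter parameters.
def freeze_backbone(model: nn.Module) -> None:
    for name, param in model.named_parameters():
        param.requires_grad = "adapter" in name

# Sanity check of the shape-preserving residual design.
adapter = Adapter(dim=768)            # 768 = typical Transformer hidden size
x = torch.randn(2, 100, 768)          # (batch, tokens, dim)
assert adapter(x).shape == x.shape
```

Because the up-projection starts at zero, the residual path dominates early in training, which is one reason the residual connection matters for the learning process the paper analyzes.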