Are we describing the same sound? An analysis of word embedding spaces of expressive piano performance
CoRR (2023)
Abstract
Semantic embeddings play a crucial role in natural language-based information
retrieval. Embedding models represent words and contexts as vectors whose
spatial configuration is derived from the distribution of words in large text
corpora. While such representations are generally very powerful, they might
fail to account for fine-grained domain-specific nuances. In this article, we
investigate this uncertainty for the domain of characterizations of expressive
piano performance. Using a music research dataset of free text performance
characterizations and a follow-up study sorting the annotations into clusters,
we derive a ground truth for a domain-specific semantic similarity structure.
We test five embedding models and their similarity structure for correspondence
with the ground truth. We further assess the effects of contextualizing
prompts, hubness reduction, cross-modal similarity, and k-means clustering. The
quality of embedding models shows great variability with respect to this task;
more general models perform better than domain-adapted ones and the best model
configurations reach human-level agreement.
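
As a rough illustration of what testing an embedding model's similarity structure against a cluster-derived ground truth could look like, the sketch below correlates pairwise cosine similarities with a binary "same cluster" indicator. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the toy terms, the two-dimensional vectors, the cluster labels, and the choice of Pearson correlation as the agreement score are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pearson(xs, ys):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def agreement(embeddings, clusters):
    """Correlate model similarity with cluster co-membership over all term pairs.

    Assumption: the ground truth is reduced to a binary indicator
    (1.0 if two terms share a cluster, else 0.0).
    """
    pairs = list(combinations(sorted(embeddings), 2))
    model_sim = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
    truth_sim = [1.0 if clusters[a] == clusters[b] else 0.0 for a, b in pairs]
    return pearson(model_sim, truth_sim)


# Hypothetical toy data: four descriptors, two human-assigned clusters.
toy_embeddings = {
    "gentle": [0.9, 0.1],
    "soft": [0.8, 0.2],
    "aggressive": [0.1, 0.9],
    "harsh": [0.2, 0.8],
}
toy_clusters = {"gentle": 0, "soft": 0, "aggressive": 1, "harsh": 1}

score = agreement(toy_embeddings, toy_clusters)
print(f"agreement with ground truth: {score:.3f}")
```

A score near 1 would mean that terms humans sorted into the same cluster also lie close together in the embedding space. With real data, the vectors would come from an embedding model rather than being written by hand.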