Are ID Embeddings Necessary? Whitening Pre-trained Text Embeddings for Effective Sequential Recommendation
CoRR (2024)
Abstract
Recent sequential recommendation models have combined pre-trained text
embeddings of items with item ID embeddings to achieve superior recommendation
performance. Despite their effectiveness, the expressive power of text features
in these models remains largely unexplored. While most existing models
emphasize the importance of ID embeddings in recommendations, our study takes a
step further by studying sequential recommendation models that only rely on
text features and do not necessitate ID embeddings. Upon examining pre-trained
text embeddings experimentally, we discover that they reside in an anisotropic
semantic space, with an average cosine similarity of over 0.8 between items. We
also demonstrate that this anisotropic nature hinders recommendation models
from effectively differentiating between item representations and leads to
degenerated performance. To address this issue, we propose to employ a
pre-processing step known as whitening transformation, which transforms the
anisotropic text feature distribution into an isotropic Gaussian distribution.
Our experiments show that whitening pre-trained text embeddings in the
sequential model can significantly improve recommendation performance. However,
the full whitening operation might break the potential manifold of items with
similar text semantics. To preserve the original semantics while benefiting
from the isotropy of the whitened text features, we introduce WhitenRec+, an
ensemble approach that leverages both fully whitened and relaxed whitened item
representations for effective recommendations. We further discuss and analyze
the benefits of our design through experiments and proofs. Experimental results
on three public benchmark datasets demonstrate that WhitenRec+ outperforms
state-of-the-art methods for sequential recommendation.
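The core pre-processing idea in the abstract, whitening pre-trained text embeddings so their distribution becomes isotropic, can be sketched as below. This is a minimal illustration, not the paper's implementation: the `shrink` parameter here is an assumed way to express "relaxed" whitening (0 = full whitening, 1 = centering only), and the ZCA-style rotation back to the original axes is one common choice among several.

```python
import numpy as np

def whiten(X, shrink=0.0, eps=1e-8):
    """ZCA-style whitening of item text embeddings X of shape (n_items, dim).

    shrink in [0, 1] relaxes the transform: 0 applies full whitening
    (identity covariance), 1 only centers the data. The interpolation
    via the eigenvalue exponent is an illustrative assumption, not the
    paper's exact relaxation scheme.
    """
    Xc = X - X.mean(axis=0, keepdims=True)          # center
    cov = Xc.T @ Xc / len(X)                        # (dim, dim) covariance
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigendecomposition
    # Exponent -1/2 gives full whitening; 0 leaves scales untouched.
    scale = eigvals.clip(min=eps) ** (-0.5 * (1.0 - shrink))
    # Rotate into the eigenbasis, rescale, rotate back (ZCA).
    return Xc @ (eigvecs * scale) @ eigvecs.T

def avg_cosine(X):
    """Average pairwise cosine similarity, the anisotropy measure
    the abstract reports as exceeding 0.8 for raw text embeddings."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    S = Xn @ Xn.T
    n = len(X)
    return (S.sum() - n) / (n * (n - 1))            # exclude the diagonal
```

On synthetic embeddings that share a large common offset (mimicking an anisotropic semantic cone), `avg_cosine` is close to 1 before whitening and close to 0 after, while the whitened covariance is the identity.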