Hierarchical Query Classification in E-commerce Search
arXiv (2024)
Abstract
E-commerce platforms typically store and structure product information and
search data in a hierarchy. Efficiently categorizing user search queries into a
similar hierarchical structure is paramount to enhancing user experience on
e-commerce platforms, as well as in news curation and academic research. The
significance of this task is amplified when dealing with sensitive query
categorization or critical information dissemination, where inaccuracies can
lead to considerable negative impacts. The inherent complexity of hierarchical
query classification is compounded by two primary challenges: (1) the
pronounced class imbalance that skews towards dominant categories, and (2) the
inherent brevity and ambiguity of search queries that hinder accurate
classification.
To address these challenges, we introduce a novel framework that leverages
hierarchical information through (i) enhanced representation learning that
utilizes the contrastive loss to discern fine-grained instance relationships
within the hierarchy, called "instance hierarchy", and (ii) a nuanced
hierarchical classification loss that attends to the intrinsic label taxonomy,
named "label hierarchy". Additionally, based on our observation that certain
unlabeled queries share typographical similarities with labeled queries, we
propose a neighborhood-aware sampling technique to intelligently select these
unlabeled queries to boost the classification performance. Extensive
experiments demonstrate that our proposed method outperforms the
state of the art (SOTA) on the proprietary Amazon dataset, and is comparable to
SOTA on the public Web of Science and RCV1-V2 datasets. These results
underscore the efficacy of our proposed solution and pave the way toward the
next generation of hierarchy-aware query classification systems.
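
The neighborhood-aware sampling idea can be illustrated with a minimal sketch. The abstract does not specify how typographical similarity is measured or how the selected queries are used afterwards, so everything below is an assumption made for illustration: similarity is taken to be Jaccard overlap of character trigrams, and the names `char_ngrams`, `jaccard`, `select_neighbors`, and the `threshold` parameter are hypothetical rather than taken from the paper.

```python
from typing import Iterable, List, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a lower-cased, space-padded query."""
    padded = f" {text.lower().strip()} "
    if len(padded) < n:
        return {padded}
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two n-gram sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def select_neighbors(
    labeled: Iterable[str],
    unlabeled: Iterable[str],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> List[Tuple[str, str, float]]:
    """Keep unlabeled queries whose closest labeled query is similar enough.

    Returns (unlabeled_query, nearest_labeled_query, similarity) triples.
    """
    labeled_grams = [(q, char_ngrams(q)) for q in labeled]
    selected = []
    for query in unlabeled:
        grams = char_ngrams(query)
        best_score, best_neighbor = 0.0, None
        for neighbor, neighbor_grams in labeled_grams:
            score = jaccard(grams, neighbor_grams)
            if score > best_score:
                best_score, best_neighbor = score, neighbor
        if best_neighbor is not None and best_score >= threshold:
            selected.append((query, best_neighbor, best_score))
    return selected
```

Under these assumptions, an unlabeled query such as "wireless headphone" would be selected as a neighbor of the labeled query "wireless headphones", while an unrelated query such as "garden hose" would be skipped. A production system would likely replace the quadratic scan with an approximate nearest-neighbor index.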