Does the Order Matter? A Random Generative Way to Learn Label Hierarchy for Hierarchical Text Classification

IEEE/ACM Transactions on Audio, Speech, and Language Processing (2024)

Abstract
Hierarchical Text Classification (HTC) is an essential and challenging task because of the difficulty of modeling the label hierarchy. Recent generative methods have achieved state-of-the-art performance by flattening the local label hierarchy into a label sequence with a specific order. However, no such order naturally exists among the labels, and the generation of the current label should incorporate information from all other target labels. Moreover, generative methods usually suffer from the error accumulation problem. To this end, we propose a new framework named sequence-to-label (Seq2Label), which learns the label hierarchy for hierarchical text classification in a random generative way. Instead of using only one specific order, we shuffle the label sequence with a Label Sequence Random Shuffling (LSRS) mechanism, so that each text is mapped to several label sequences with different orders during the training phase. To alleviate the error accumulation problem, we further propose a Hierarchy-aware Negative Sampling (HNS) strategy with a negative label-aware loss to better distinguish target labels from negative labels. In this way, our model can capture the hierarchical and co-occurrence information of the target labels of each text. Experimental results on three benchmark datasets show that Seq2Label achieves state-of-the-art results.
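The LSRS mechanism described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation; the function name and parameters are hypothetical:

```python
import random

def label_sequence_random_shuffle(labels, num_orders=3, seed=0):
    """Sketch of Label Sequence Random Shuffling (LSRS): map one set of
    target labels to several randomly ordered label sequences, so the
    model is not trained on a single fixed generation order."""
    rng = random.Random(seed)
    sequences = []
    for _ in range(num_orders):
        seq = list(labels)   # copy the target label list
        rng.shuffle(seq)     # randomize the generation order
        sequences.append(seq)
    return sequences

# Each shuffled sequence contains the same target labels,
# only their order differs across training examples.
orders = label_sequence_random_shuffle(["Science", "Physics", "Optics"])
```

During training, each shuffled sequence would serve as a separate target for the generative decoder, exposing the model to multiple valid orderings of the same label set.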
Keywords
Error accumulation, hierarchy-aware negative sampling, hierarchical text classification, label sequence random shuffling