Hybrid Feature-Based Multi-label Text Classification-A Framework

ADVANCES IN CYBERSECURITY, CYBERCRIMES, AND SMART EMERGING TECHNOLOGIES (2023)

Abstract
In Multi-label Text Classification (MLTC), a text document can belong to one or more classes simultaneously. Such tasks pose both general and MLTC-specific research challenges. The general challenges include dependency among classes, imbalanced data, and scalability when the number of labels is very large. The MLTC-specific challenges include a high-dimensional feature space, obtaining contextual and semantic knowledge from the text, and understanding content diversity. This paper briefly describes multi-label classification approaches such as problem transformation, algorithm adaptation, and ensemble learning, along with their strengths and weaknesses. Furthermore, we propose an MLTC framework, referred to as HMTCS (Hybrid feature-based Multi-label Text Classification System), that handles both general multi-label issues and issues specific to text categorization. The proposed framework has three modules: Labels Knowledge Base, Hybrid Feature Extraction, and Ensemble Learning.
Keywords
Multi-label text classification, Natural language processing, Ensemble learning, Deep learning
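
To make the "problem transformation" approach named in the abstract concrete, the following is a minimal sketch of binary relevance, one common transformation that splits a multi-label task into one independent binary task per label. It is not the HMTCS framework itself; the function name `binary_relevance_targets` and the toy label sets are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Set


def binary_relevance_targets(label_sets: List[Set[str]]) -> Dict[str, List[int]]:
    """Turn per-document label sets into one 0/1 target vector per label.

    Each returned vector can be used to train a separate binary classifier,
    which is the core idea of the binary relevance transformation.
    """
    # Collect every label that appears in at least one document.
    labels = sorted(set().union(*label_sets))
    # For each label, mark which documents carry it (1) and which do not (0).
    return {label: [int(label in s) for s in label_sets] for label in labels}


# Hypothetical toy corpus: three documents, each tagged with one or more classes.
doc_labels = [{"sports", "politics"}, {"politics"}, {"tech", "sports"}]
targets = binary_relevance_targets(doc_labels)

for label, vector in targets.items():
    print(label, vector)
```

Because each label is treated independently, this transformation ignores the dependency among classes that the abstract lists as a general challenge, which is one motivation for the algorithm adaptation and ensemble approaches it also mentions.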