Hierarchical Text Classification for News Articles Based on Named Entities

ADMA (2012)

Abstract
A range of hierarchical text classification approaches classify text documents into a pre-constructed hierarchy of categories. In these approaches, feature selection is often based on terms (words or phrases), which are ill-suited to hierarchically classifying news articles. Named entities are informative features of news articles that have not been studied seriously in previous hierarchical text classification approaches. This paper uses named entities as features for classifying news articles into a pre-constructed hierarchy about international relations. Feature selection is based on the named entities associated with local categories. Documents are then represented by the selected features using two types of models: a Boolean model and a vector model. We train SVMs for both types of models based on local information. The experimental results show that the use of named entities improves the performance of hierarchical text classification for news articles. © Springer-Verlag 2012.
Keywords
feature selection, hierarchical text classification, named entity, support vector machine
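
To make the two document representations named in the abstract concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not state the selection criterion or the weighting scheme, so the sketch assumes that features are the most frequent entities per local category and that the vector model uses raw entity counts. The function names, category labels, and entity lists are all hypothetical, and entity extraction is assumed to have been done already.

```python
from collections import Counter


def select_features(docs_by_category, top_k=2):
    """Pick the top_k most frequent entities in each local category.

    Assumption: frequency is the selection criterion; the abstract does
    not say which criterion the paper uses.
    """
    selected = []
    for category in sorted(docs_by_category):
        counts = Counter(
            entity for doc in docs_by_category[category] for entity in doc
        )
        # Sort by descending count, then by name, so ties are deterministic.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        for entity, _ in ranked[:top_k]:
            if entity not in selected:
                selected.append(entity)
    return selected


def boolean_vector(doc_entities, features):
    """Boolean model: 1 if the selected entity occurs in the document, else 0."""
    present = set(doc_entities)
    return [1 if f in present else 0 for f in features]


def count_vector(doc_entities, features):
    """Vector model (assumed here to be raw counts of each selected entity)."""
    counts = Counter(doc_entities)
    return [counts[f] for f in features]


# Hypothetical training documents, each already reduced to its named entities.
docs_by_category = {
    "trade": [["WTO", "China", "WTO"], ["WTO", "EU"]],
    "conflict": [["NATO", "Russia", "NATO"], ["NATO", "Russia"]],
}

features = select_features(docs_by_category)
new_doc = ["WTO", "China", "WTO", "Japan"]

bool_repr = boolean_vector(new_doc, features)
count_repr = count_vector(new_doc, features)
print(features, bool_repr, count_repr)
```

Either list of vectors could then be passed to an SVM trained per local category, which is the step the abstract describes next; that step is left out here because the abstract gives no details about the SVM configuration.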