Deep feature weighting for naive Bayes and its application to text classification.

Engineering Applications of Artificial Intelligence (2016)

Abstract
Naive Bayes (NB) continues to rank among the top 10 data mining algorithms due to its simplicity, efficiency, and efficacy. Among the numerous proposals to improve the accuracy of naive Bayes by weakening its feature independence assumption, the feature weighting approach has received less attention from researchers. Moreover, to our knowledge, all existing feature weighting approaches incorporate the learned feature weights only into the classification formula of naive Bayes and do not incorporate them into its conditional probability estimates at all. In this paper, we propose a simple, efficient, and effective feature weighting approach, called deep feature weighting (DFW), which estimates the conditional probabilities of naive Bayes by deeply computing feature-weighted frequencies from training data. Empirical studies on a collection of 36 benchmark datasets from the UCI repository show that naive Bayes with deep feature weighting rarely degrades the quality of the model compared to standard naive Bayes and, in many cases, improves it dramatically. In addition, we apply the proposed deep feature weighting to some state-of-the-art naive Bayes text classifiers and achieve remarkable improvements.
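The abstract describes the idea only at a high level. As a rough illustration, below is a minimal Python sketch of a naive Bayes classifier whose conditional probability estimates are built from feature-weighted frequencies, with the same weights also applied as exponents in the decision rule. The smoothing form, the function names, and the assumption that weights are supplied as an input (the paper learns them with a correlation-based scheme) are all illustrative choices, not the authors' code.

import numpy as np

def train_dfw_nb(X, y, w, classes, n_values):
    """Estimate DFW-style probabilities from discrete training data.

    X        : (n, d) array of non-negative integer feature values
    y        : (n,) array of class labels
    w        : (d,) feature weights (assumed given, e.g. from a
               correlation-based weighting step)
    n_values : number of distinct values per feature

    Each per-class frequency of feature j is scaled by its weight w[j]
    before smoothing, so the weights enter the conditional probability
    estimates themselves, not just the decision rule.
    """
    n, d = X.shape
    priors = np.array([(np.sum(y == c) + 1.0) / (n + len(classes))
                       for c in classes])
    # cond[ci][j][v] approximates P(feature j = v | class ci)
    cond = []
    for c in classes:
        Xc = X[y == c]
        per_feature = []
        for j in range(d):
            counts = np.bincount(Xc[:, j], minlength=n_values[j])
            # weighted frequencies with Laplace smoothing (assumed form)
            per_feature.append((w[j] * counts + 1.0) /
                               (w[j] * len(Xc) + n_values[j]))
        cond.append(per_feature)
    return priors, cond

def predict_dfw_nb(x, w, priors, cond, classes):
    """Classify x; weights also act as exponents in the log-score."""
    scores = []
    for ci in range(len(classes)):
        log_p = np.log(priors[ci])
        for j, v in enumerate(x):
            log_p += w[j] * np.log(cond[ci][j][v])
        scores.append(log_p)
    return classes[int(np.argmax(scores))]

Setting all weights to 1 reduces the estimates to standard Laplace-smoothed naive Bayes, which matches the abstract's claim that DFW rarely degrades the model relative to the baseline.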
Keywords
Naive Bayes, Feature weighting, Correlation-based feature selection, Text classification