
Improved Bayesian Based Method for Classifying Disease Documents

2016 World Symposium on Computer Applications & Research (WSCAR) (2016)

Abstract
Naïve Bayes has proved to be a decently competitive learning and classification approach in many fields and is still being actively researched. We propose a Bayesian-based classification method for biomedical disease-related documents. The proposed method relies on the difference in class distribution between the presence and absence of attributes. Specifically, in a simple inductive learning setting, the difference in class probability between the presence and absence of feature fj can be a good metric for the contribution of fj to predicting the class. The proposed method works well with biomedical text abstracts because the attribute values (feature counts) of word features are not high. We found that heavy medical terms tend to occur with fairly low frequencies in these abstracts but contribute significantly to determining the class and subject of the document. Therefore, this technique is suitable for biomedical text mining because it gives weight to terms with low per-document frequency, and such terms play an important role in predicting the class in biomedical texts. The evaluation is conducted on seven datasets and compared against the Bayesian method as our baseline using accuracy and AUC, with encouraging results: the proposed method outperformed the baseline significantly. Moreover, we investigated the effect of terms with low average frequency and their contribution to classification accuracy.
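The presence-versus-absence idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so the code below is only one plausible reading: for each term fj it estimates P(class | fj present) and P(class | fj absent) from document counts and takes their difference. The function name `presence_absence_scores`, the add-one smoothing, and the toy documents are all assumptions made for illustration, not the authors' implementation.

```python
def presence_absence_scores(docs, labels, target):
    """Score each term by P(target | term present) - P(target | term absent).

    docs   : list of sets of tokens (only presence matters, not counts)
    labels : list of class labels, one per document
    target : the class whose probability is compared
    """
    vocab = set().union(*docs)
    n_docs = len(docs)
    n_target = sum(1 for y in labels if y == target)

    scores = {}
    for term in vocab:
        present = sum(1 for d in docs if term in d)
        present_target = sum(
            1 for d, y in zip(docs, labels) if term in d and y == target
        )
        absent = n_docs - present
        absent_target = n_target - present_target

        # Add-one smoothing is an assumption here; it avoids division by
        # zero when a term appears in every document or in none.
        p_present = (present_target + 1) / (present + 2)
        p_absent = (absent_target + 1) / (absent + 2)
        scores[term] = p_present - p_absent
    return scores


# Hypothetical toy corpus: a common word ("patient") and rarer medical terms.
docs = [
    {"patient", "carcinoma"},
    {"patient", "metastasis", "carcinoma"},
    {"patient", "arrhythmia"},
    {"patient", "stent"},
]
labels = ["cancer", "cancer", "cardio", "cardio"]

scores = presence_absence_scores(docs, labels, "cancer")
for term, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{term:12s} {score:+.3f}")
```

In this toy corpus the ubiquitous word "patient" scores zero, while a term such as "carcinoma", which appears in few documents but only in one class, receives a large positive score. This mirrors the abstract's observation that low-frequency medical terms can carry most of the class information.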
Key words
biomedical text classification, biomedical text mining, text categorization, Bayesian classification