Weight Your Words - The Effect of Different Weighting Schemes on Wordification Performance.

Tatiana Sciammarella,Gerson Zaverucha

ILP(2019)

引用 0|浏览25
暂无评分
摘要
Relational data models are usually used by researchers and companies when a single-table model is not enough to describe their system. Then, when it comes to classification, there are mainly two options: apply the corresponding relational version of classification algorithms or use a propositionalization technique to transform the relational database into a single-table representation before classification. In this work, we evaluate a fast and simple propositionalization algorithm called Wordification. This technique uses the table name, attribute name and value to create a feature. Each feature is treated as a word and the instances of the database are represented by a Bag-Of-Words (BOW) model. Then, a weighting scheme is used to weight the features for each instance. The original implementation of Wordification only explored the TF-IDF, the term-frequency and the binary weighting schemes. However, works in the text classification and data mining fields show that the proper choice of weighting schemes can boost classification. Therefore, we empirically experimented different term weighting approaches with Wordification. Our results show that the right combination of weighting scheme and classification algorithm can significantly improve classification performance.
更多
查看译文
关键词
Propositionalization, Relational data mining, Term weighting, Wordification, Classification
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要