A K-Means Clustering Algorithm - Using the Chi-Square as a Distance.

HCC(2018)

Abstract
The recurrent use of databases with categorical variables in different fields of science demands new approaches when applying cluster analysis techniques to this type of database. For this reason, in this article we compare the Matlab function kmeans() with a K-Means function implemented by us, which additionally integrates a similarity measure that the Matlab function does not have: the chi-square distance. Both algorithms were tested on databases with quantitative and categorical variables. The experimental results showed a higher level of classification success for the function implemented by us, confirming the correct functioning of the implemented algorithm and demonstrating that the chi-square distance is an appropriate similarity measure for categorical databases.
Keywords
Database, Cluster, K-means, Metric, Qualitative variable
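
To make the idea in the abstract concrete, the following is a minimal sketch of K-Means with a chi-square distance, not the authors' implementation. The abstract does not give the exact formula, so this sketch assumes the common symmetric form d(x, c) = 1/2 * sum_i (x_i - c_i)^2 / (x_i + c_i) applied to non-negative vectors such as one-hot encoded categorical records. The function names `chi_square_distance` and `kmeans_chi2`, the random initialization, and the mean-based centroid update are all illustrative assumptions.

```python
import random


def chi_square_distance(x, c):
    """Symmetric chi-square distance between two non-negative vectors.

    Assumed form: 0.5 * sum((x_i - c_i)^2 / (x_i + c_i)).
    """
    total = 0.0
    for xi, ci in zip(x, c):
        denom = xi + ci
        if denom > 0:  # skip terms where both entries are zero
            total += (xi - ci) ** 2 / denom
    return 0.5 * total


def kmeans_chi2(points, k, max_iter=100, seed=0):
    """Lloyd-style K-Means that assigns points by chi-square distance.

    Assumption: centroids are updated with the plain per-column mean,
    as in standard K-Means; the paper may use a different update.
    """
    rng = random.Random(seed)
    # Initialize centroids from k randomly chosen data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(max_iter):
        # Assignment step: nearest centroid under the chi-square distance.
        new_labels = [
            min(range(k), key=lambda j: chi_square_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: mean of the members; empty clusters keep their centroid.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical one-hot encoded records: two categorical variables, two levels each.
data = [[1, 0, 1, 0]] * 3 + [[0, 1, 0, 1]] * 3
labels, centroids = kmeans_chi2(data, k=2)
print(labels)
```

On this toy data the two groups of identical records end up in separate clusters; which group receives label 0 depends on the random initialization. In Matlab, a comparable experiment would require custom code, since the abstract states that kmeans() does not offer the chi-square distance.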