An Euclidean Distance Based on Tensor Product Graph Diffusion Related Attribute Value Embedding for Nominal Data Clustering.

THIRTY-SECOND AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE / THIRTIETH INNOVATIVE APPLICATIONS OF ARTIFICIAL INTELLIGENCE CONFERENCE / EIGHTH AAAI SYMPOSIUM ON EDUCATIONAL ADVANCES IN ARTIFICIAL INTELLIGENCE (2018)

Abstract
Unlike numerical data clustering, nominal data clustering is a difficult problem because there is no natural relative ordering among nominal attribute values. This paper aims to make the Euclidean distance measure suitable for nominal data clustering. The core idea is attribute value embedding, namely, transforming each nominal attribute value into a numerical vector. The embedding method consists of four steps. In the first step, weights that quantify the amount of information in attribute values are calculated for each value of each nominal attribute, based on each object and its k nearest neighbors. In the second step, an intra-attribute value similarity matrix is created for each nominal attribute from the attribute value weights. In the third step, for each nominal attribute, we find another attribute that has the maximal dependence on it and build an inter-attribute value similarity matrix from the attribute value weights related to these two attributes. In the last step, a diffusion matrix is constructed for each nominal attribute by the tensor product graph diffusion process, so that the resulting value embedding captures both intra- and inter-attribute value similarity information. To evaluate the effectiveness of the proposed method, experiments are conducted on 10 data sets. The experimental results demonstrate that our method not only enables the Euclidean distance to be used for nominal data clustering but also achieves better clustering performance than several existing state-of-the-art approaches.
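The last step can be illustrated with a minimal sketch of a generic tensor product graph diffusion, followed by a Euclidean distance between two embedded values. The abstract does not give the paper's formulas, so everything here is an assumption: the update rule Q ← A Q Bᵀ + I is the commonly used form of this diffusion, both similarity matrices are taken to be square over the values of one attribute, the row scaling factor 0.9 is chosen only to guarantee convergence, and the toy matrices, the value names, and the helper names `row_normalize` and `tpg_diffusion` are hypothetical.

```python
import numpy as np


def row_normalize(S, scale=0.9):
    """Scale each row to sum to `scale` (< 1) so the iteration converges.

    The factor 0.9 is an illustrative assumption, not a value from the paper.
    """
    S = np.asarray(S, dtype=float)
    return scale * S / S.sum(axis=1, keepdims=True)


def tpg_diffusion(S_intra, S_inter, n_iter=200):
    """Generic tensor product graph diffusion: Q <- A @ Q @ B.T + I.

    A and B are the row-normalized intra- and inter-attribute value
    similarity matrices (both assumed n x n for this sketch).
    """
    A = row_normalize(S_intra)
    B = row_normalize(S_inter)
    n = A.shape[0]
    identity = np.eye(n)
    Q = identity.copy()
    for _ in range(n_iter):
        Q = A @ Q @ B.T + identity
    return Q


# Toy similarity matrices for a hypothetical attribute with three values.
values = ["red", "green", "blue"]
S_intra = np.array([[1.0, 0.8, 0.1],
                    [0.8, 1.0, 0.2],
                    [0.1, 0.2, 1.0]])
S_inter = np.array([[1.0, 0.6, 0.3],
                    [0.6, 1.0, 0.1],
                    [0.3, 0.1, 1.0]])

Q = tpg_diffusion(S_intra, S_inter)

# Treat row i of the diffusion matrix as the embedding of value i.
embedding = {v: Q[i] for i, v in enumerate(values)}

# Euclidean distance between two nominal values in the embedded space.
dist_red_green = np.linalg.norm(embedding["red"] - embedding["green"])
print(dist_red_green)
```

Once each nominal value has a vector, an object can be represented by concatenating the vectors of its attribute values, and any Euclidean-distance-based algorithm such as k-means can be applied to the result. How the paper itself builds the weights and similarity matrices in the first three steps is not specified in the abstract and is not reproduced here.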