An Improved DBSCAN Method Considering Non-spatial Similarity by Using Min-Hash

Advances in Computer Science and Ubiquitous Computing, Lecture Notes in Electrical Engineering (2021)

Abstract
In data mining, several clustering algorithms use a spatial attribute to group spatial objects in geometric space. A spatial object, however, can carry non-spatial attributes as well as spatial ones, and few clustering algorithms yet use both. Jaccard similarity is one way to group similar spatial objects by their non-spatial attributes, but it is costly to compute. This paper therefore proposes an improved DBSCAN method that uses both the spatial and the non-spatial attributes, so that the objects in each cluster are more similar to one another, and that uses Min-Hash to reduce the cost of computing Jaccard similarity. When the rangeQuery step of the existing DBSCAN process collects a neighborhood by Euclidean distance, the proposed method additionally takes the similarity of the non-spatial attributes into account, estimated with Min-Hash. We compare classical DBSCAN with our method on a real dataset to demonstrate real-world applicability, and we use synthetic datasets built with various experimental variables to compare the performance of exact Jaccard similarity against that of Min-Hash.
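The abstract does not give the algorithm in detail, so the following is only a minimal sketch of the idea it describes: a DBSCAN-style range query that keeps a candidate only if it is within Euclidean distance `eps` and its Min-Hash estimate of Jaccard similarity reaches a threshold. The names `eps`, `min_sim`, `num_hashes`, the 2-D point layout, the tag-set representation of the non-spatial attribute, and the linear hash family are all assumptions made for illustration, not details taken from the paper.

```python
import math
import random
import zlib

PRIME = (1 << 61) - 1  # large prime for the (assumed) linear hash family


def jaccard(a, b):
    """Exact Jaccard similarity |A ∩ B| / |A ∪ B| of two token collections."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def make_hash_funcs(num_hashes, seed=0):
    """Draw (a, b) pairs for hashes of the form h(x) = (a * x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(tokens, hash_funcs):
    """One minimum per hash function over the token set."""
    # crc32 gives a stable integer id per token (unlike built-in hash()).
    ids = [zlib.crc32(t.encode("utf-8")) for t in set(tokens)]
    if not ids:
        # Sentinel signature for an empty set (assumption of this sketch).
        return [PRIME] * len(hash_funcs)
    return [min((a * x + b) % PRIME for x in ids) for a, b in hash_funcs]


def minhash_similarity(sig_a, sig_b):
    """Fraction of matching signature positions; estimates Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def range_query(points, signatures, q, eps, min_sim):
    """Indices within eps of point q whose estimated similarity is >= min_sim."""
    qx, qy = points[q]
    return [
        i for i, (x, y) in enumerate(points)
        if math.hypot(x - qx, y - qy) <= eps
        and minhash_similarity(signatures[q], signatures[i]) >= min_sim
    ]


# Hypothetical data: 2-D coordinates plus a tag set per object.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
tags = [
    {"cafe", "wifi", "terrace"},
    {"cafe", "wifi", "terrace"},
    {"bank", "atm", "parking"},
    {"cafe", "wifi", "terrace"},
]
hash_funcs = make_hash_funcs(num_hashes=128)
signatures = [minhash_signature(t, hash_funcs) for t in tags]
neighbors = range_query(points, signatures, q=0, eps=0.5, min_sim=0.5)
print(neighbors)
```

Signatures are computed once per object, so each similarity check inside the range query compares two fixed-length lists instead of intersecting two sets. That is where the cost saving over exact Jaccard similarity would come from in a sketch like this. The estimate becomes more accurate as `num_hashes` grows.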
Keywords
Clustering, DBSCAN, Non-spatial attribute, Min-Hash