Factors influencing K means algorithm

Shejuti Khan, S M Monzurur Rahman, M Faysal Tanim, Fizar Ahmed

International Journal of Computational Systems Engineering (2013)

Abstract
Clustering is an unsupervised learning technique, and K-means is one of the most popular clustering algorithms. K-means requires the number of clusters to be specified in advance. Finding the appropriate number of clusters for a dataset is a trial-and-error process, made more difficult by the subjective nature of deciding what constitutes a ‘correct’ clustering (Han and Kamber, 2000). The aim of K-means is to group the items into k clusters such that items in the same cluster are as similar to each other as possible and items in different clusters are as dissimilar as possible. Different distance measures can be used to calculate similarity. Improving the performance of K-means is therefore useful and can yield better clusterings. Such improvement depends on factors that need to be explored and measured experimentally. This paper does so and identifies five factors that influence the performance of K-means.
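
The grouping step described in the abstract can be illustrated with a minimal sketch of the standard K-means (Lloyd) iteration with a swappable distance function. This is not the authors' implementation; the function names (`kmeans`, `euclidean`, `manhattan`), the random initialisation, and the stopping rule are assumptions made for illustration only.

```python
import math
import random


def euclidean(a, b):
    """Straight-line distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def kmeans(points, k, distance=euclidean, max_iter=100, seed=0):
    """Illustrative K-means: assign each point to its nearest centroid,
    then move each centroid to the mean of its members.

    Note: the mean update minimises squared Euclidean distance; with
    another distance (e.g. manhattan) it is only a heuristic.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    labels = None
    for _ in range(max_iter):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        # Update step: recompute each centroid as the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centroids


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
```

Passing `distance=manhattan` changes how similarity is measured in the assignment step, and `k` must still be chosen by the caller, which reflects the trial-and-error choice the abstract mentions.
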
Keywords
cluster, unit vector, convergence, distance equation, encoding, k means