Soft Set Based Clustering and Its Comparison on Categorical Data
2023 IEEE 9th Information Technology International Seminar (ITIS), 2023
Abstract
Clustering categorical data is difficult because similarity between categorical objects is hard to define and costly to compute. Several methods, most recently centroid-based strategies, have been developed to reduce this complexity, but they still incur long processing times. This article proposes an alternative, soft set-based clustering (SSC), built on the probability function of the multivariate multinomial distribution. The data are represented as soft sets, and each soft set carries a probability for each object. The probability that an object belongs to each cluster is obtained from the joint cluster distribution function, which follows from the multivariate multinomial distribution function, and each object is assigned to the cluster with the highest probability. The approach is compared with baseline techniques on benchmark data sets from the UCI Machine Learning Repository. The results show that the proposed method performs better in purity, Rand index, and computation time.
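The assignment step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a generic multivariate multinomial model with hard assignments, Laplace smoothing (`alpha`), and caller-supplied initial labels, and the names `soft_set`, `cluster`, `init_labels`, and the toy data are hypothetical. The soft set is taken here to be the mapping from each (attribute, value) pair to the set of objects that have that value.

```python
import math
from collections import defaultdict


def soft_set(data):
    """Map each (attribute index, value) pair to the set of object indices having it."""
    table = defaultdict(set)
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            table[(j, value)].add(i)
    return dict(table)


def cluster(data, k, init_labels, n_iter=20, alpha=1.0):
    """Hard-assignment clustering under an assumed multivariate multinomial model."""
    n, m = len(data), len(data[0])
    table = soft_set(data)
    # Number of distinct values per attribute, used for Laplace smoothing.
    n_values = [len({row[j] for row in data}) for j in range(m)]
    labels = list(init_labels)

    for _ in range(n_iter):
        members = [{i for i in range(n) if labels[i] == c} for c in range(k)]

        def log_prob(row, c):
            size = len(members[c])
            # Smoothed cluster weight.
            lp = math.log((size + alpha) / (n + alpha * k))
            # Smoothed per-attribute category probabilities, read off the soft set.
            for j, value in enumerate(row):
                count = len(table[(j, value)] & members[c])
                lp += math.log((count + alpha) / (size + alpha * n_values[j]))
            return lp

        # Each object goes to the cluster with the highest (log) probability.
        new_labels = [max(range(k), key=lambda c: log_prob(row, c)) for row in data]
        if new_labels == labels:
            break
        labels = new_labels
    return labels


# Toy data (hypothetical): three attributes per object.
data = [
    ("red", "small", "round"),
    ("red", "small", "oval"),
    ("red", "small", "round"),
    ("blue", "large", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "flat"),
]
# The last object starts in the wrong cluster and is moved by the first pass.
labels = cluster(data, k=2, init_labels=[0, 0, 0, 1, 1, 0])
print(labels)  # [0, 0, 0, 1, 1, 1]
```

Working with set intersections keeps the per-cluster counts explicit; a tabular one-hot encoding would give the same counts and may be faster on large data sets.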
Keywords
Soft set, categorical data, multinomial distribution