Similarity Grouping by Influence: Exploring Result Diversification in Similarity Group-by Operators.

SBBD(2023)

Abstract
The group-by operator groups tuples that share the same values in specified attributes, then extracts summaries from each group. However, much of the data stored by modern applications is best queried not by equality but by similarity, which raises questions such as: "How can we obtain groups such that each one contains the k most similar tuples?" or "How can we include diversity in the results?". In this paper, we present a binary grouping operator focused on diversified similarity comparisons that is able to answer such questions. We define the operator algebraically and show that it enables grouping operations over complex attributes, such as multidimensional data. We provide an algorithm, called Similarity Grouping by Influence (SGIa), to implement the binary operator. An experimental evaluation performed on real data shows that SGIa meets real application needs in a timely manner, with significant results.
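To make the contrast between equality-based and similarity-based grouping concrete, here is a minimal sketch in Python. It is not the SGIa algorithm and does not model influence or diversification, since the abstract gives no details of either. It only illustrates the first question above, under the assumption that a binary similarity grouping takes a set of reference tuples and a set of data tuples and puts the k nearest data tuples into each reference's group. The function names `group_by_equality` and `group_by_similarity`, the Euclidean distance, and the sample points are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


def group_by_equality(rows, key):
    """Classic group-by: rows land in the same group only if their keys are equal."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return dict(groups)


def group_by_similarity(references, rows, k, dist=math.dist):
    """Illustrative binary grouping (an assumption, not the paper's operator):
    each reference tuple gets a group holding the k rows nearest to it."""
    groups = {}
    for ref in references:
        # Rank all rows by distance to this reference and keep the k closest.
        ranked = sorted(rows, key=lambda r: dist(ref, r))
        groups[ref] = ranked[:k]
    return groups


# Equality grouping on a categorical attribute.
sales = [("a", 1), ("b", 2), ("a", 3)]
eq_groups = group_by_equality(sales, key=lambda r: r[0])

# Similarity grouping on two-dimensional points (hypothetical sample data).
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (9.0, 0.5)]
refs = [(0.0, 0.0), (5.0, 5.0)]
sim_groups = group_by_similarity(refs, points, k=2)
```

In this sketch a row may appear in more than one group, and rows far from every reference, such as `(9.0, 0.5)`, appear in none. How the paper's operator resolves such cases and introduces diversity is not stated in the abstract.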