Efficient mining of top-k high utility itemsets through genetic algorithms

Information Sciences (2023)

Abstract
Mining high utility itemsets is an emerging and very active research area in data mining. The goal is to mine all itemsets whose utility value, which reflects their importance to the user, is no less than a predefined threshold. Setting an appropriate threshold is not trivial: it requires not only multiple trials but also know-how in the application field. The advantage of algorithms for mining top-k high utility itemsets is that they do not require such a utility threshold, but they suffer from very long runtimes and large memory requirements on large input data. We propose a new genetic algorithm for mining top-k high utility itemsets, named TKHUIM-GA (Top-K High Utility Itemset Mining through Genetic Algorithms). It guides the search process by considering the utility of each item when producing initial solutions and when combining solutions, thereby reducing runtime and memory consumption. A highly efficient data representation further reduces memory usage and runtime. A key advantage of TKHUIM-GA is that, unlike existing approaches, it works with positive, negative, integer and real unit utility values. Experiments on popular benchmark datasets demonstrate the high performance of the proposal compared with state-of-the-art algorithms.
Keywords
High utility itemsets, Top-k utility mining, Genetic algorithms
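
To make the problem statement in the abstract concrete, the following is a minimal sketch of what "top-k high utility itemsets" means, assuming the common definition in which an itemset's utility is the sum, over every transaction containing the whole itemset, of quantity times unit utility. It is a brute-force enumeration for illustration only, not TKHUIM-GA; the toy database, the unit utilities, and the function names `itemset_utility` and `top_k_brute_force` are all hypothetical.

```python
from itertools import combinations
import heapq

# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"a": 1, "b": 2, "c": 1},
    {"a": 2, "c": 3},
    {"b": 1, "c": 1, "d": 4},
]

# Hypothetical unit utilities; negative and real values are allowed.
unit_utility = {"a": 5.0, "b": 2.0, "c": 1.0, "d": -0.5}


def itemset_utility(itemset, transactions, unit_utility):
    """Sum quantity * unit utility over transactions containing every item."""
    total = 0.0
    for t in transactions:
        if all(item in t for item in itemset):
            total += sum(t[item] * unit_utility[item] for item in itemset)
    return total


def top_k_brute_force(transactions, unit_utility, k):
    """Score every itemset that occurs in at least one transaction; keep the k best."""
    items = sorted(unit_utility)
    scored = []
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            # Skip itemsets that never occur together in the database.
            if any(all(item in t for item in itemset) for t in transactions):
                scored.append(
                    (itemset_utility(itemset, transactions, unit_utility), itemset)
                )
    return heapq.nlargest(k, scored)


top3 = top_k_brute_force(transactions, unit_utility, k=3)
for utility, itemset in top3:
    print(itemset, utility)
```

No utility threshold appears anywhere: the user supplies only `k`. The enumeration visits up to 2^n itemsets for n items, which is the cost that heuristic searches such as genetic algorithms aim to avoid on large inputs.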