A new tree-based approach to mine sequential patterns

EXPERT SYSTEMS WITH APPLICATIONS(2024)

Abstract
The generic sequential pattern mining problem aims to mine, from a sequential database, the set of sequential patterns that satisfy a minimum support (occurrence threshold) constraint. The main challenges affecting the efficiency of a solution include reducing the pattern search space, detecting infrequent patterns early, and representing the database in an efficient format. Additional challenges arise when the problem setting moves from a static to an incremental database: rather than re-mining from scratch, the solution must efficiently track the effect of the incremental portion on the complete updated database. In this article, we introduce a new tree-based solution to the sequential pattern mining problem, with separate novel solutions for static and incremental sequential databases. We propose two new structures, SP-Tree and IncSP-Tree, and design two efficient algorithms, Tree-Miner and IncTree-Miner, to mine the complete set of sequential patterns from static and incremental databases, respectively. The proposed structures store the complete sequential database efficiently, maintain the "build-once-mine-many" property, and allow interactive mining. We also design a new breadth-first support counting technique to identify infrequent patterns at early stages, and a new heuristic pruning strategy to reduce the pattern search space. We further design a new pattern storage structure, BPFSP-Tree, which stores the frequent patterns across successive iterations of incremental mining to reduce the number of database scans and to remove infrequent patterns efficiently. A novel structure named Sequence Summarizer is also introduced to efficiently calculate and update the co-occurrence information of the items, especially in an incremental environment.
Experimental results on various real-life and synthetic datasets demonstrate the efficiency of our work in comparison with related state-of-the-art approaches.
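
To make the problem statement concrete, the following is a minimal Python sketch of how the support of one candidate pattern can be counted against a small sequential database. It is a brute-force illustration of the problem definition only, not the paper's Tree-Miner, SP-Tree, or breadth-first support counting technique. The names `contains`, `support`, `is_frequent`, and `toy_db`, and the toy data itself, are assumptions introduced here for illustration.

```python
from typing import FrozenSet, List

# Assumed representation: a sequence is an ordered list of events,
# and each event is a set of items that occur together.
Itemset = FrozenSet[str]
Sequence = List[Itemset]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each pattern event must be a subset of a distinct sequence event,
    and the matched events must appear in the same order.
    Greedy left-to-right matching is sufficient for this check.
    """
    pos = 0  # index of the next pattern event to match
    for event in sequence:
        if pos < len(pattern) and pattern[pos] <= event:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> int:
    """Count the sequences in `database` that contain `pattern`."""
    return sum(1 for seq in database if contains(seq, pattern))


def is_frequent(database: List[Sequence], pattern: Sequence, min_support: int) -> bool:
    """A pattern is frequent if its support meets the minimum threshold."""
    return support(database, pattern) >= min_support


# Hypothetical toy database of four sequences (illustrative data only).
toy_db: List[Sequence] = [
    [frozenset("a"), frozenset("bc"), frozenset("d")],
    [frozenset("ab"), frozenset("c")],
    [frozenset("b"), frozenset("a"), frozenset("cd")],
    [frozenset("c"), frozenset("a")],
]

if __name__ == "__main__":
    a_then_c = [frozenset("a"), frozenset("c")]
    print(support(toy_db, a_then_c))         # number of sequences with a, later c
    print(is_frequent(toy_db, a_then_c, 2))  # threshold check
```

Scanning every sequence for every candidate, as this sketch does, is exactly the cost that compact database representations and early pruning of infrequent patterns are meant to avoid.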
Keywords
Sequential pattern,Tree-based mining,Incremental mining,Breadth-first based pruning,Pattern storage