TALENT: Targeted Mining of Non-overlapping Sequential Patterns

CoRR(2023)

Abstract
Efficient pattern mining algorithms are now widely applied, and sequential patterns with gap constraints have become a valuable tool for discovering knowledge in biological data such as DNA and protein sequences. Among gap-constrained mining methods, non-overlapping sequential pattern mining finds interesting patterns while satisfying the anti-monotonic (Apriori) property. However, existing algorithms do not search for targeted sequential patterns, so they generate unnecessary and redundant patterns. Targeted pattern mining both returns the patterns that users are actually interested in and reduces redundant pattern generation, which avoids much irrelevant computation. In this paper, we define and formalize the problem of targeted non-overlapping sequential pattern mining and propose an algorithm named TALENT (TArgeted mining of sequentiaL pattErN with consTraints). Two search methods, breadth-first and depth-first, are designed for pattern generation. Furthermore, we present several pruning strategies that reduce the number of sequences and items read from the data and terminate redundant pattern extensions. Finally, we select a series of datasets with different characteristics and conduct extensive experiments comparing TALENT with existing algorithms for mining non-overlapping sequential patterns. The experimental results show that TALENT mines efficiently and handles many different query settings.
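To make the terms in the abstract concrete, the following is a minimal sketch, not the TALENT algorithm itself. It assumes the common definitions from the non-overlapping mining literature: a gap is the number of characters skipped between two consecutive matched positions, and two occurrences are non-overlapping when they never use the same sequence position for the same pattern position. It also assumes that a pattern matches a target query when it contains the query as a subsequence. The function names `find_occurrence`, `greedy_support`, and `contains_target` are hypothetical, and the greedy count is only a lower bound on the true non-overlapping support.

```python
from typing import List, Optional, Set


def find_occurrence(seq: str, pattern: str, min_gap: int, max_gap: int,
                    used: List[Set[int]]) -> Optional[List[int]]:
    """Return the leftmost occurrence of `pattern` in `seq` (0-based positions)
    whose j-th position is not already in used[j], or None if there is none."""

    def extend(j: int, prev: int) -> Optional[List[int]]:
        if j == len(pattern):
            return []
        if j == 0:
            candidates = range(len(seq))
        else:
            # The gap (characters skipped) must lie in [min_gap, max_gap].
            lo = prev + min_gap + 1
            hi = min(prev + max_gap + 1, len(seq) - 1)
            candidates = range(lo, hi + 1)
        for pos in candidates:
            if seq[pos] == pattern[j] and pos not in used[j]:
                rest = extend(j + 1, pos)
                if rest is not None:
                    return [pos] + rest
        return None

    return extend(0, -1)


def greedy_support(seq: str, pattern: str, min_gap: int, max_gap: int) -> int:
    """Greedily count non-overlapping occurrences (a lower bound on support)."""
    used: List[Set[int]] = [set() for _ in pattern]
    count = 0
    while True:
        occ = find_occurrence(seq, pattern, min_gap, max_gap, used)
        if occ is None:
            return count
        # Reserve each position for the pattern index that consumed it.
        for j, pos in enumerate(occ):
            used[j].add(pos)
        count += 1


def contains_target(pattern: str, target: str) -> bool:
    """Assumed target test: `target` is a subsequence of `pattern`."""
    it = iter(pattern)
    return all(ch in it for ch in target)
```

In this sketch, a targeted miner would keep a candidate pattern only if `contains_target(pattern, query)` holds and `greedy_support` summed over all sequences reaches a minimum support threshold. Note that in `"AAA"` the pattern `"AA"` with gap `[0, 1]` has two non-overlapping occurrences, because position 1 may serve as the second character of one occurrence and the first character of another.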
Keywords
targeted mining, patterns, non-overlapping