TriSig: Evaluating the statistical significance of triclusters

Pattern Recognition (2024)

Abstract
Tensor data analysis allows researchers to uncover novel patterns and relationships that cannot be obtained from tabular data alone. The information inferred from multi-way patterns can offer valuable insights into disease progression, bioproduction processes, behavioral responses, weather fluctuations, or social dynamics. However, spurious patterns often hamper this process. This work proposes a statistical frame to assess the probability that patterns in tensor data deviate from null expectations, extending well-established principles for assessing the statistical significance of patterns in tabular data. It presents a principled discussion of binomial testing to mitigate false positive discoveries in light of variable dependencies, temporal associations and misalignments, and multi-hypothesis correction. Results gathered from applying triclustering algorithms to distinct real-world case studies in biotechnological domains support the validity of the proposed statistical frame while revealing vulnerabilities of reference triclustering searches. The proposed assessment can be incorporated into existing triclustering algorithms to minimize spurious occurrences, rank patterns, and further prune the search space, reducing their computational complexity.
Keywords
Triclustering, Pattern discovery, Statistical significance, Temporal pattern mining, Multivariate time series data
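
The binomial testing mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the TriSig procedure itself: it assumes, for illustration only, that the values making up a pattern occur independently under the null, so the pattern's per-observation probability is the product of the individual value probabilities. It deliberately ignores the variable dependencies and temporal misalignments that the abstract highlights. The function names (`binomial_tail`, `pattern_p_value`, `bonferroni`) and their parameters are hypothetical, and Bonferroni is used here only as one common choice of multi-hypothesis correction.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pattern_p_value(value_probs, n_observations, support):
    """p-value of a pattern seen in `support` of `n_observations` observations.

    Assumption (illustrative): the pattern's values are independent under the
    null, so the pattern probability is the product of `value_probs`.
    """
    p_pattern = 1.0
    for q in value_probs:
        p_pattern *= q
    return binomial_tail(n_observations, support, p_pattern)


def bonferroni(p_value, n_hypotheses):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p_value * n_hypotheses)


if __name__ == "__main__":
    # Hypothetical pattern: 3 values, each with null probability 0.2,
    # observed in 5 of 50 observations, among 1000 candidate patterns.
    raw = pattern_p_value([0.2, 0.2, 0.2], n_observations=50, support=5)
    print(raw, bonferroni(raw, n_hypotheses=1000))
```

A pattern would be retained as significant when its adjusted p-value falls below a chosen threshold; the same quantity could in principle be used to rank patterns, as the abstract suggests.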