Efficient And Effective Explanation Of Change In Hierarchical Summaries

KDD(2007)

Abstract
Dimension attributes in data warehouses are typically hierarchical (e.g., geographic locations in sales data, URLs in Web traffic logs). OLAP tools are used to summarize the measure attributes (e.g., total sales) along a dimension hierarchy, and to characterize changes (e.g., trends and anomalies) in a hierarchical summary over time. When the number of changes identified is large (e.g., total sales in many stores differed from their expected values), a parsimonious explanation of the most significant changes is desirable. In this paper, we propose a natural model of parsimonious explanation, as a composition of node weights along the root-to-leaf paths in a dimension hierarchy, which permits changes to be aggregated with maximal generalization along the dimension hierarchy. We formalize this model of explaining changes in hierarchical summaries and investigate the problem of identifying optimally parsimonious explanations on arbitrary rooted one-dimensional tree hierarchies. We show that such explanations can be computed efficiently, in time essentially proportional to the number of leaves and the depth of the hierarchy. Further, our method can produce parsimonious explanations from the output of any statistical model that provides predictions and confidence intervals, making it widely applicable. Our experiments use real data sets to demonstrate the utility and robustness of our proposed model for explaining significant changes, as well as its superior parsimony compared to alternatives.
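The idea of an explanation as a composition of node weights along root-to-leaf paths can be illustrated with a small sketch. Everything below is an assumption made for illustration and is not taken from the paper: the toy hierarchy, the names `PARENT`, `explained_change`, and `explanation_size`, the choice of multiplication as the composition operator, and the convention that a weight of 1.0 means "no change". The sketch only evaluates a given explanation; it does not implement the search for an optimally parsimonious one.

```python
from math import prod

# Hypothetical toy dimension hierarchy: child -> parent (the root maps to None).
PARENT = {
    "All": None,
    "East": "All",
    "West": "All",
    "NYC": "East",
    "Boston": "East",
    "LA": "West",
}

def path_to_root(node, parent=PARENT):
    """Return the nodes on the path from `node` up to the root, inclusive."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path

def explained_change(leaf, weights, parent=PARENT):
    """Compose the weights on the leaf's root-to-leaf path.

    Assumption: composition is multiplication, and any node without an
    explicit weight contributes the neutral weight 1.0.
    """
    return prod(weights.get(n, 1.0) for n in path_to_root(leaf, parent))

def explanation_size(weights):
    """Count the non-neutral weights; fewer means a more parsimonious explanation."""
    return sum(1 for w in weights.values() if w != 1.0)

# Two explanations of the same observation: sales doubled at every East store.
leaf_level = {"NYC": 2.0, "Boston": 2.0}   # one weight per changed leaf
generalized = {"East": 2.0}                # one weight on the common ancestor

leaves = ["NYC", "Boston", "LA"]
changes_leaf_level = {l: explained_change(l, leaf_level) for l in leaves}
changes_generalized = {l: explained_change(l, generalized) for l in leaves}
```

Under these assumptions both explanations account for the same leaf-level changes, but the one that places a single weight on `East` uses fewer non-neutral weights, which is the sense in which generalizing along the hierarchy yields a more parsimonious explanation.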
Keywords
OLAP, hierarchical summary, change, parsimonious explanations, statistical model