Estimating mutual information on data streams

International Conference on Scientific and Statistical Database Management (2015)

Abstract
Mutual information is a well-established and broadly used concept in information theory. It quantifies the mutual dependence between two variables, an essential task in data analysis. For static data, a broad range of techniques addresses the problem of estimating mutual information. However, the assumption of static data does not hold for today's dynamic data sources such as data streams: unlike static approaches, an online estimator must be able to deal with the evolving, changing, and infinite nature of the stream. Furthermore, some tasks require the estimate to be available online while the raw data stream is being processed. Our proposed solution Mise (Mutual Information Stream Estimation) allows a user to issue mutual information queries in arbitrary time windows. As a key feature, we introduce a novel sampling scheme, which ensures an equal treatment of queries over multiple time scales, e.g., ranging from milliseconds up to decades. We thoroughly analyze the requirements of such a multiscale sampling scheme, and evaluate the resulting quality of Mise in a broad range of experiments.
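
To make the notion of a mutual information query over a time window concrete, the following is a minimal sketch and not the Mise algorithm itself. It assumes discrete-valued stream items of the form (timestamp, x, y) and uses a simple plug-in estimator based on empirical frequencies. The names `mutual_information` and `window_query` are hypothetical, and the sketch keeps the entire stream in memory, which an unbounded stream does not permit; the sampling scheme described in the abstract addresses exactly that problem and is not reproduced here.

```python
import math
from collections import Counter


def mutual_information(pairs):
    """Plug-in mutual information estimate (in bits) for discrete (x, y) pairs."""
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                 # counts of each (x, y) combination
    px = Counter(x for x, _ in pairs)      # marginal counts of x
    py = Counter(y for _, y in pairs)      # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def window_query(stream, start, end):
    """Estimate mutual information over items with start <= timestamp < end.

    Assumption: `stream` is a fully stored list of (timestamp, x, y) tuples.
    """
    return mutual_information([(x, y) for t, x, y in stream if start <= t < end])


# Toy stream: x and y are identical for t in [0, 8), independent for t in [8, 16).
stream = [(t, t % 2, t % 2) for t in range(8)]
stream += [(t, t % 2, (t // 2) % 2) for t in range(8, 16)]

dependent = window_query(stream, 0, 8)     # fully dependent window
independent = window_query(stream, 8, 16)  # independent window
```

On this toy stream, the first window yields 1 bit (y is fully determined by a fair binary x) and the second yields 0 bits, which shows how the answer to a query depends on the time window it covers.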