When to Re-Draw a Sample, and Why.

IEEE BigData (2021)

Abstract
One way to estimate a statistic over a large data set is to draw a sample consisting of some records from the data set, and compute the statistic over the sample as an estimate of the statistic over the data set. This procedure may fail to produce an accurate estimate. Using one sample for multiple statistics reduces computation and latency, but it can increase the probability of multiple failures to produce accurate estimates, because estimates based on the same sample may not have independent failure probabilities. We show how to bound the probability of multiple failures for sequences of estimates over one or more samples.
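The abstract does not give the bounds themselves, so the following is only a minimal sketch of the setting, not the authors' method. It assumes values scaled to a known range, draws with replacement, and swaps in two standard tools: Hoeffding's inequality for the failure probability of a single mean estimate, and a union bound for the probability that any of several estimates fails. The union bound is used because it holds even when the estimates share a sample and their failures are not independent. All function names and parameters are hypothetical.

```python
import math
import random


def hoeffding_failure_bound(n, epsilon, value_range=1.0):
    """Upper bound on P(|sample mean - true mean| >= epsilon).

    Assumes n i.i.d. draws (sampling with replacement) of values that lie
    in an interval of width value_range. This is Hoeffding's inequality,
    a standard bound, not the bound derived in the paper.
    """
    return min(1.0, 2.0 * math.exp(-2.0 * n * epsilon ** 2 / value_range ** 2))


def union_failure_bound(per_estimate_bounds):
    """Upper bound on P(at least one estimate fails).

    The union bound needs no independence assumption, so it applies when
    every estimate is computed from the same sample.
    """
    return min(1.0, sum(per_estimate_bounds))


def estimate_means(dataset, columns, n, rng):
    """Draw one sample of n records and reuse it to estimate several means."""
    sample = [rng.choice(dataset) for _ in range(n)]  # with replacement
    return {c: sum(record[c] for record in sample) / n for c in columns}


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical data set: two numeric fields, each in [0, 1].
    dataset = [{"x": rng.random(), "y": rng.random()} for _ in range(100_000)]

    n, epsilon = 1000, 0.05
    estimates = estimate_means(dataset, ["x", "y"], n, rng)
    single = hoeffding_failure_bound(n, epsilon)
    any_failure = union_failure_bound([single] * len(estimates))
    print(estimates, single, any_failure)
```

Under these assumptions, the bound on any failure grows linearly with the number of statistics computed from one sample. That growth is one simple way to see the trade-off the abstract describes between reusing a sample and drawing a new one.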
Keywords
sampling, mean estimation, nearly uniform validation