Reducing Data Loading Bottleneck with Coarse Feature Vectors for Large Scale Learning.

BIGMINE'14: Proceedings of the 3rd International Conference on Big Data, Streams and Heterogeneous Source Mining: Algorithms, Systems, Programming Models and Applications - Volume 36 (2014)

Abstract
In large scale learning, disk I/O for data loading is often the runtime bottleneck. We propose a lossy data compression scheme with fast decompression to reduce disk I/O, allocating fewer than the standard 32 bits for each real value in the data set. We theoretically show that the estimation error induced by the lossy compression decreases exponentially with the number of bits used per value. Our experiments show the proposed method achieves excellent performance with a small number of bits and substantial speedups during training.
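To illustrate why fewer-than-32-bit encodings can still yield small error, consider uniform scalar quantization over a known range: with b bits the quantization step is (hi - lo)/(2^b - 1), so the worst-case reconstruction error halves with each additional bit, i.e. decays exponentially in b. The sketch below is an assumption for illustration only, not the paper's actual compression scheme; the function names and the uniform-quantizer choice are hypothetical.

```python
import numpy as np

def quantize(x, bits, lo, hi):
    """Map real values in [lo, hi] to `bits`-bit integer codes (lossy compression)."""
    levels = (1 << bits) - 1  # number of quantization steps
    return np.round((x - lo) / (hi - lo) * levels).astype(np.uint32)

def dequantize(codes, bits, lo, hi):
    """Map integer codes back to approximate reals (fast decompression: one multiply-add)."""
    levels = (1 << bits) - 1
    return codes / levels * (hi - lo) + lo

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=10_000)
for b in (4, 8, 12):
    x_hat = dequantize(quantize(x, b, -1.0, 1.0), b, -1.0, 1.0)
    # Worst-case error is half a step: (hi - lo) / (2 * (2**b - 1)),
    # so each extra bit roughly halves the error.
    print(b, np.max(np.abs(x_hat - x)))
```

Storing 8-bit codes instead of 32-bit floats cuts the bytes read from disk by 4x, which is the source of the I/O speedup the abstract describes.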