Distributed compression and decompression for big image data: LZW and Huffman coding

Rohan Kishor Netalkar, Hillol Barman, Rushik Subba, Kandula Venkata Preetam, Undi Surya Narayana Raju

Journal of Electronic Imaging (2021)

Abstract
Digital data are now created and transmitted primarily as images and videos. Storing and transmitting such a large number of images requires substantial computing resources such as storage and bandwidth, so compressing the image data before storing it saves considerable resources. Image compression removes as much redundant data as possible from an image and retains only the non-redundant data. To compress and decompress such big image data, a distributed environment based on the map-reduce paradigm, using the Hadoop Distributed File System and Apache Spark, is used. In addition, a Microsoft Azure cloud environment with infrastructure as a service is used. Four setups (a single system, a 1 + 4 node cluster, a 1 + 15 node cluster, and a 1 + 18 node cluster cloud infrastructure) are used to compare execution times on a self-created large image dataset. On these four self-made setups, more than 100 million (109,670,400) images are compressed and decompressed, and the execution times are compared for two traditional image compression methods: Lempel-Ziv-Welch (LZW) and Huffman coding. Both LZW and Huffman coding are lossless image compression techniques. LZW removes both spatial and coding redundancies, whereas Huffman coding removes only coding redundancy. These two techniques are just placeholders; they can be replaced with any other compression technique for large image data. In our work, we use compression ratio, average root mean square error (ARMSE), and average peak signal-to-noise ratio to validate that the compression and decompression process for each technique is exactly the same irrespective of the number of systems used, distributed or not. © 2021 SPIE and IS&T
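As a rough illustration of the per-image work described in the abstract, the following sketch compresses and decompresses a few synthetic images with a textbook LZW coder and then computes compression ratio, RMSE, and PSNR. It is not the authors' implementation: the function names, the synthetic images, and the fixed-width code assumption in `compression_ratio` are all hypothetical, and the built-in `map` stands in for the distributed map step that the paper runs on Spark.

```python
import math
from typing import Dict, List


def lzw_compress(data: bytes) -> List[int]:
    """Encode bytes as LZW dictionary codes (table seeded with all 256 byte values)."""
    table = {bytes([i]): i for i in range(256)}
    current = b""
    codes: List[int] = []
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = len(table)  # learn the new sequence
            current = bytes([value])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the original bytes by regrowing the same table on the fly."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    previous = table[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == len(table):
            # Code refers to the entry that is about to be created.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code: %d" % code)
        out += entry
        table[len(table)] = previous + entry[:1]
        previous = entry
    return bytes(out)


def rmse(a: bytes, b: bytes) -> float:
    """Root mean square error between two equally sized pixel buffers."""
    if len(a) != len(b):
        raise ValueError("buffers differ in length")
    if not a:
        return 0.0
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def psnr(a: bytes, b: bytes, peak: int = 255) -> float:
    """Peak signal-to-noise ratio in dB; infinite for an exact reconstruction."""
    err = rmse(a, b)
    return math.inf if err == 0 else 20 * math.log10(peak / err)


def compression_ratio(original: bytes, codes: List[int]) -> float:
    """Original bits over compressed bits.

    Assumption: every code is stored at one fixed width, large enough for
    the biggest code emitted (at least 9 bits). Real coders vary the width.
    """
    if not codes:
        return 1.0
    bits_per_code = max(9, max(codes).bit_length())
    return (len(original) * 8) / (len(codes) * bits_per_code)


def process_image(pixels: bytes) -> Dict[str, float]:
    """Round-trip one image and report the three validation metrics."""
    codes = lzw_compress(pixels)
    restored = lzw_decompress(codes)
    return {
        "ratio": compression_ratio(pixels, codes),
        "rmse": rmse(pixels, restored),
        "psnr": psnr(pixels, restored),
    }


# Hypothetical 64x64 grayscale images: each row holds one constant gray level.
images = [bytes((row * k) % 256 for row in range(64) for _ in range(64))
          for k in (1, 2, 3)]

# Local stand-in for the distributed map step.
results = list(map(process_image, images))
armse = sum(r["rmse"] for r in results) / len(results)
print("ARMSE:", armse)
```

Because `process_image` depends only on its own input, the same function could in principle be applied per image on any number of workers, which is why lossless metrics such as an ARMSE of zero should not change with cluster size.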
Keywords
distributed system, SPARK, compression, decompression, Lempel-Ziv-Welch, Huffman coding