High-Performance Tensor Decoder on GPUs for Wireless Camera Networks in IoT

HPCC/SmartCity/DSS (2019)

Abstract
With the rapid development of the Internet of Things, tensor-based coding and decoding algorithms are widely used in wireless camera networks. Recently, a novel video decoder based on the low-tubal-rank tensor model has been proposed, which achieves better quality of service than conventional schemes. However, the tensor decoding algorithm is compute-intensive, rendering it impractical for real-time applications. In this paper, we propose effective strategies to accelerate the tensor decoder on GPUs (Graphics Processing Units). We implement the tensor decoding algorithm on the GPU architecture and propose optimization strategies to eliminate data-reorganization overhead, provide batched complex matrix computations, and reduce both memory consumption and computational cost. On real video data, the GPU algorithm achieves an average speedup of 237.44× (up to 312.39×) on a Tesla V100 GPU over the CPU algorithm on a multi-core CPU, while preserving similar recovery errors. The video recovered by this high-performance GPU tensor decoder shows good visual quality.
Keywords
GPU, tensor decoder, low-tubal-rank tensor model, alternating minimization algorithm
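The "batched complex matrix computations" in the abstract arise from the t-product at the heart of the low-tubal-rank model: an FFT along the tube (third) dimension turns the tensor product into many independent complex matrix multiplies, one per frequency slice, which a GPU can execute as a single batch. The sketch below is an illustrative NumPy version of that structure, not the paper's implementation; the function name and shapes are our own.

```python
import numpy as np

def t_product(A, B):
    """Illustrative t-product used by low-tubal-rank tensor models.

    A: n1 x n2 x n3, B: n2 x n4 x n3. Steps: FFT along the tube
    dimension, one independent complex matrix multiply per frequency
    slice (the per-slice multiplies are what a GPU runs as a batched
    kernel), then an inverse FFT back to the original domain.
    """
    n3 = A.shape[2]
    Af = np.fft.fft(A, axis=2)
    Bf = np.fft.fft(B, axis=2)
    Cf = np.empty((A.shape[0], B.shape[1], n3), dtype=complex)
    for k in range(n3):            # independent slices -> batched on a GPU
        Cf[:, :, k] = Af[:, :, k] @ Bf[:, :, k]
    # Inputs are real, so the result is real up to round-off.
    return np.real(np.fft.ifft(Cf, axis=2))
```

On a GPU, the per-slice loop would be replaced by a single batched GEMM over all frequency slices (e.g. a strided-batched complex matrix-multiply call), which is the optimization pattern the abstract refers to.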