NCOD: Near-Optimum Video Compression for Object Detection.

Ardavan Elahi, Ali Falahati, Farhad Pakdaman, Mehdi Modarressi, Moncef Gabbouj

ISCAS (2023)

Abstract
With the emergence of technologies like smart cities, the Internet of Things (IoT), and 5G, the amount of visual data produced at edge and remote nodes has exploded. Since a considerable portion of the captured video is intended for a machine learning task rather than a human audience, transmitting video in such applications requires efficient compression tailored for machine vision. However, existing compression solutions are optimized for human vision. This paper presents a methodology to optimize an existing video compression standard, HEVC, for a machine vision task, Object Detection (OD). To this end, (1) a dataset of compressed videos, covering several compression ratios and their corresponding OD performance, is collected to enable modeling; (2) a trade-off point (knee-point) between bitrate and OD performance is defined, marking the point after which no major improvement is achieved; and (3) a set of features is extracted and studied to model this point via a practical machine learning method. The resulting solution can predict the knee-point with MAE=1.28, resulting in a ΔRecall of only 0.012 and a bitrate reduction of 86.56%, compared to OD with very high-quality video.
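The knee-point idea can be illustrated with a minimal sketch. The abstract does not give the exact definition, so the rule below is an assumption: the knee is taken as the highest CRF (strongest compression) whose recall stays within a chosen tolerance of the recall at the best-quality setting. The function name `knee_point`, the `tolerance` parameter, and all sample numbers are hypothetical and are not taken from the paper.

```python
def knee_point(crfs, recalls, tolerance=0.01):
    """Return the highest CRF whose recall is within `tolerance` of the
    recall at the lowest CRF (the best-quality reference).

    Assumed definition for illustration only; the paper may define the
    knee-point differently.
    """
    pairs = sorted(zip(crfs, recalls))      # ascending CRF = descending quality
    reference_recall = pairs[0][1]          # recall at the best-quality setting
    knee = pairs[0][0]
    for crf, recall in pairs:
        if reference_recall - recall <= tolerance:
            knee = crf                      # still close enough to the reference
        else:
            break                           # recall has dropped past the tolerance
    return knee


# Hypothetical measurements for one video (not data from the paper).
crfs = [10, 20, 30, 40, 50]
recalls = [0.80, 0.80, 0.795, 0.70, 0.40]
bitrates_kbps = [8000, 3000, 1000, 400, 150]

knee = knee_point(crfs, recalls)
knee_bitrate = bitrates_kbps[crfs.index(knee)]
# Fraction of bitrate saved relative to the best-quality encode.
bitrate_saving = 1 - knee_bitrate / bitrates_kbps[0]
print(knee, bitrate_saving)
```

In the paper's setting the knee-point is predicted from extracted video features rather than measured by exhaustive encoding; this sketch only shows how such a point could be read off a measured curve to produce training labels.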
Key words
Video coding, Video coding for machine (VCM), CRF, Object Detection, JND