Light Field Salient Object Detection With Sparse Views via Complementary and Discriminative Interaction Network

IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY (2024)

Abstract
4D light field data record a scene from multiple views and thus implicitly provide a depth cue that benefits salient object detection in challenging scenes. Existing light field salient object detection (LF SOD) methods usually use a large number of views to improve detection accuracy. However, using so many views makes LF SOD difficult to deploy in practice. Considering that adjacent views in a light field have very similar content, in this work we propose a more efficient pattern of input views, i.e., key sparse views, and design a network to effectively exploit the depth cue from sparse views for LF SOD. Specifically, we first apply a low-rank-based statistical analysis to the existing LF SOD datasets, from which we derive a fixed yet universal pattern for our key sparse views, including the number and positions of the views. These views retain a sufficient depth cue while greatly reducing the number of views to be captured and processed, facilitating practical applications. Then, we propose an effective solution for LF SOD from key sparse views, named CDINet, built around a key Complementary and Discriminative Interaction Module (CDIM). CDINet follows a two-stream structure that extracts the depth cue from the light field stream (i.e., sparse views) and the appearance cue from the RGB stream (i.e., center view), generating features and initial saliency maps for each stream. The CDIM is tailored for inter-stream interaction of both these features and saliency maps: it uses the depth cue to complement the missing salient regions in the RGB stream and to discriminate background distractions, further enhancing the final saliency map. Extensive experiments on three LF multi-view datasets demonstrate that CDINet not only outperforms state-of-the-art 2D methods but also achieves competitive performance compared with state-of-the-art 3D and 4D methods.
The code and results of our method are available at https://github.com/GilbertRC/LFSOD-CDINet.
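
The abstract does not spell out the low-rank-based statistical analysis, so the following is only a rough sketch of the underlying intuition: if adjacent views are nearly redundant, a matrix whose rows are the flattened views should have a low effective rank. The function name `effective_rank`, the SVD energy criterion, and the 0.95 threshold are illustrative assumptions, not the authors' procedure.

```python
import numpy as np


def effective_rank(views, energy=0.95):
    """Smallest number of singular values holding `energy` of the total.

    views: array of shape (n_views, H, W), one grayscale image per view.
    This energy criterion is an assumption made for illustration only.
    """
    # Flatten each view into one row of an (n_views, H*W) matrix.
    mat = views.reshape(views.shape[0], -1).astype(np.float64)
    # Singular values only; they are returned in descending order.
    s = np.linalg.svd(mat, compute_uv=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    # First index at which the cumulative energy reaches the threshold.
    return int(np.searchsorted(cumulative, energy) + 1)


# Synthetic 9-view stack built from only two base images, standing in
# for a light field whose views are highly redundant.
angles = 2 * np.pi * np.arange(9) / 9
base_a = np.zeros((4, 4))
base_a[0, 0] = 1.0
base_b = np.zeros((4, 4))
base_b[1, 1] = 1.0
views = (np.cos(angles)[:, None, None] * base_a
         + np.sin(angles)[:, None, None] * base_b)

print(effective_rank(views))
```

A small effective rank relative to the number of views would suggest that a few well-placed views carry most of the information, which is the motivation the abstract gives for key sparse views. How the fixed number and positions are actually chosen is described in the paper, not here.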
Keywords
Light fields, Feature extraction, Arrays, Object detection, Cameras, Streaming media, Three-dimensional displays, Light field, salient object detection, sparse views, complementary and discriminative interaction