Vision Transformer-based Real-Time Camouflaged Object Detection System at Edge.

SMARTCOMP(2023)

Abstract
Camouflaged object detection is a challenging computer vision task that involves identifying objects that are intentionally or unintentionally hidden in their surroundings. Vision transformer mechanisms play a critical role in improving the performance of deep learning models by focusing on the features most relevant to detecting objects under camouflaged conditions. In this paper, we use a vision transformer (VT) in two phases: (a) integrating the VT with a deep learning architecture for efficient monocular depth map generation for camouflaged objects, and (b) embedding the VT in a multiclass object detection model with multimodal feature input (RGB with RGB-D), which increases the visual cues and gives the model more representational information for better performance. Additionally, we performed an ablation study to understand the role of the vision transformer in camouflaged object detection, and applied GRAD-CAM on top of the model to visualize the performance improvement achieved by embedding the VT in the model architecture. We deployed the model on resource-constrained edge devices for real-time object detection to test the trained model's performance under realistic conditions.
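As a rough illustration of the multimodal input idea, the sketch below appends a normalized depth channel to each RGB pixel so that a detector could receive four channels instead of three. The abstract does not say how RGB and depth are combined, so the early fusion by channel concatenation shown here is an assumption, and the function name `fuse_rgb_depth` and the toy data are hypothetical. Plain nested lists are used to keep the example dependency-free; a real pipeline would more likely use an array library.

```python
def fuse_rgb_depth(rgb, depth):
    """Append a min-max normalized depth channel to every RGB pixel.

    rgb:   H x W nested list of (r, g, b) tuples with values in 0..255
    depth: H x W nested list of depth values (any unit)
    Returns an H x W nested list of (r, g, b, d) tuples scaled to [0, 1].

    Assumption: early fusion by channel concatenation; the paper's actual
    fusion strategy is not described in the abstract.
    """
    if len(rgb) != len(depth) or any(len(a) != len(b) for a, b in zip(rgb, depth)):
        raise ValueError("rgb and depth must have the same height and width")

    # Min-max normalize depth over the whole map so it shares the RGB scale.
    flat = [d for row in depth for d in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo

    fused = []
    for rgb_row, depth_row in zip(rgb, depth):
        fused_row = []
        for (r, g, b), d in zip(rgb_row, depth_row):
            # A constant depth map carries no contrast, so map it to 0.0.
            d_norm = (d - lo) / span if span > 0 else 0.0
            fused_row.append((r / 255.0, g / 255.0, b / 255.0, d_norm))
        fused.append(fused_row)
    return fused


# Toy 2 x 2 image and depth map (hypothetical values).
rgb = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
depth = [[1.0, 2.0], [3.0, 5.0]]
fused = fuse_rgb_depth(rgb, depth)
```

Each output pixel then carries color and relative depth together, which is one simple way a depth map produced by a monocular estimator could supply extra visual cues to a detection model.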
Keywords
Camouflaged Object Detection, Multi-Modality, Vision Transformer, GRAD-CAM