Pyramid Transformer: A Multi-size Object Detection Model with Limited Device Requirements for the Nursing Robot.

Jiazheng Li, Jiexin Xie, Jiaxin Wang, Yujian Wen, Shijie Guo

ICTAI (2022)

Abstract
Multi-size object detection is a technical difficulty that impedes the development of intelligent nursing robots. To address this problem, this paper proposes a Pyramid Transformer model that detects objects of different sizes in nursing scenarios. Pyramid Transformer consists of three parts: a Transformer Module, a Pyramid Structure, and a Convolution Module. The Transformer Module improves large-object detection through its Multi-head Attention mechanism, and the Pyramid Structure lets the model make predictions from feature maps of different sizes, which benefits the detection of small objects. The Convolution Module is employed to reduce hardware requirements, allowing Pyramid Transformer to be implemented and run on a single graphics card. Experiments show that the mean average precision reaches 72.7%, an improvement over other models. This indicates that the proposed Pyramid Transformer model is practical and effective for object detection on the nursing robot. The dataset is available at https://github.com/NotFar1997/NSI-dataset.
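As a rough illustration of why predicting from feature maps of several sizes interacts with hardware cost, the sketch below computes, for each pyramid level, the feature-map size, the number of attention tokens, and the number of entries in one attention matrix. This is not the authors' implementation: the function name `pyramid_levels`, the 640x640 input, and the strides 8, 16, and 32 are assumptions chosen only for illustration, and the abstract does not state how the Convolution Module achieves its savings.

```python
def pyramid_levels(height, width, strides=(8, 16, 32)):
    """Describe each level of a hypothetical feature pyramid.

    The strides are illustrative assumptions, not values from the paper.
    """
    levels = []
    for stride in strides:
        # Downsampling by `stride` shrinks each spatial dimension.
        h, w = height // stride, width // stride
        # If every spatial position becomes one token, self-attention
        # builds a tokens x tokens matrix per head.
        tokens = h * w
        levels.append({
            "stride": stride,
            "size": (h, w),
            "tokens": tokens,
            "attn_entries": tokens * tokens,
        })
    return levels


if __name__ == "__main__":
    for level in pyramid_levels(640, 640):
        print(level)
```

Under these assumed values, the finest level keeps the most spatial detail, which is what helps with small objects, while the coarsest level has far fewer tokens, so attention over it is much cheaper.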
Keywords
Nursing robot, Multi-size object detection, Deep learning, Computer vision