Enhancing DETR with Attention-Based Thresholding for Efficient Early Japanese Book Reorganization

2023 International Conference on Advanced Mechatronic Systems (ICAMechS), 2023

Abstract
Analysis of early Japanese books provides essential clues for researching the history and culture of the period. However, these books are written in a pre-modern Japanese script called Kuzushiji, which is difficult to read unless one is an expert. OCR-based recognition of Kuzushiji has become popular in recent years, but end-to-end object detection of Kuzushiji remains difficult for conventional CNN-based models such as YOLO. This paper uses DETR, a Transformer-based object detection model that outperforms CNNs in object detection, and improves DETR's performance with threshold processing. Applying a threshold to the Self-Attention weights prunes redundant dependencies between patches and accelerates DETR training. In our experiments, DN-DAB-DETR-R50 with thresholding achieved +2.0% F1 overall and up to +7.3% F1 per book compared with the vanilla model, demonstrating the effectiveness of the thresholding. However, threshold processing slightly increases computational cost; future work will address this by pooling patches according to attention weights, exploiting the sparsity induced by thresholding.
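The thresholded Self-Attention described above can be sketched as follows. This is a minimal illustration, assuming the method zeroes attention weights below a fixed threshold tau and renormalizes the surviving weights; the exact rule and the value of tau are assumptions, since the abstract does not specify them.

```python
import torch
import torch.nn.functional as F

def thresholded_self_attention(q, k, v, tau=0.01):
    """Scaled dot-product self-attention with threshold processing.

    Attention weights below `tau` are zeroed and the remaining weights
    are renormalized, pruning weak patch-to-patch dependencies.
    `tau` is a hypothetical hyperparameter, not taken from the paper.
    q, k, v: tensors of shape (batch, heads, seq_len, d_k).
    """
    d_k = q.size(-1)
    # Standard attention weights over patches: (batch, heads, seq, seq).
    scores = torch.matmul(q, k.transpose(-2, -1)) / d_k ** 0.5
    attn = F.softmax(scores, dim=-1)
    # Zero out weak dependencies below the threshold.
    attn = torch.where(attn >= tau, attn, torch.zeros_like(attn))
    # Renormalize so each row still sums to 1 (guard against all-zero rows).
    attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-12)
    return torch.matmul(attn, v)
```

The resulting attention matrix is sparse, which is what would enable the attention-weighted patch pooling mentioned as future work.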
Keywords
DETR, Transformer, Early Japanese books, Object Detection