Online Spatio-temporal 3D Convolutional Neural Network for Early Recognition of Handwritten Gestures

DOCUMENT ANALYSIS AND RECOGNITION - ICDAR 2021, PT I (2021)

Abstract
Inspired by recent spatio-temporal Convolutional Neural Networks in the computer vision field, we propose OLT-C3D (Online Long-Term Convolutional 3D), a new architecture based on a 3D Convolutional Neural Network (3D CNN) to address the complex task of early recognition of 2D handwritten gestures in real time. The input gesture signal is translated into an image sequence over time that preserves the trajectory history. The image sequence is passed to our 3D CNN, OLT-C3D, which gives a prediction at each new frame. OLT-C3D is coupled with an integrated temporal reject system that postpones the decision if more information is needed. Moreover, our system is end-to-end trainable: OLT-C3D and the temporal reject system are jointly trained to optimize the earliness of the decision. Our approach achieves superior performance on two complementary and freely available datasets: ILGDB and MTGSetB.
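
As a rough illustration of the two ideas in the abstract (an image sequence that keeps the trajectory history, and a decision that can be postponed frame by frame), the following minimal sketch may help. It is not the authors' implementation: the function names `trajectory_to_frames` and `early_decision`, the frame count, the image size, and the fixed confidence threshold are all assumptions made for this example. In particular, the paper describes a reject system trained jointly with the network, whereas the sketch substitutes a plain threshold, and it normalizes coordinates over the whole gesture, which a truly online system could not do.

```python
import numpy as np


def trajectory_to_frames(points, n_frames=4, size=16):
    """Rasterize a 2D trajectory into cumulative binary frames.

    Frame k contains every point drawn up to that step, so each frame
    carries the trajectory history. Returns shape (n_frames, size, size).
    Hypothetical helper: names and defaults are illustrative only.
    """
    pts = np.asarray(points, dtype=float)
    # Map coordinates to pixel indices in [0, size - 1].
    # Simplification: uses the extent of the full gesture.
    mins = pts.min(axis=0)
    span = np.maximum(pts.max(axis=0) - mins, 1e-9)
    pix = np.round((pts - mins) / span * (size - 1)).astype(int)

    frames = np.zeros((n_frames, size, size), dtype=np.uint8)
    # Split the points into n_frames consecutive chunks.
    bounds = np.linspace(0, len(pix), n_frames + 1).astype(int)
    for k in range(n_frames):
        if k > 0:
            frames[k] = frames[k - 1]  # carry the history forward
        chunk = pix[bounds[k]:bounds[k + 1]]
        frames[k, chunk[:, 1], chunk[:, 0]] = 1  # rows are y, columns are x
    return frames


def early_decision(frame_probs, threshold=0.8):
    """Return (frame_index, class_index) for the first confident frame.

    Assumption: a fixed threshold stands in for the learned temporal
    reject system. If no frame is confident, the last frame decides.
    """
    for t, p in enumerate(frame_probs):
        if np.max(p) >= threshold:
            return t, int(np.argmax(p))
    return len(frame_probs) - 1, int(np.argmax(frame_probs[-1]))
```

In this sketch, `frame_probs` stands for the per-frame class probabilities that a model such as a 3D CNN would output; a lower threshold trades accuracy for an earlier decision.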
Keywords
Spatio-temporal convolutional neural network, Early recognition, Handwritten gesture, Online long-term C3D, WaveNet 3D