Sign Language Recognition Based on Lightweight 3D CNNs and Transformer

Journal of Huazhong University of Science and Technology (Natural Science Edition) (2023)

Abstract
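To address the high computational complexity and memory footprint of traditional sign language recognition methods based on 3D CNNs (three-dimensional convolutional neural networks), and the limited long-range modeling ability of continuous sign language recognition methods based on RNNs (recurrent neural networks), a sign language recognition method based on lightweight 3D CNNs and Transformer is proposed. First, lightweight 3D CNNs are used for spatiotemporal modeling in isolated-word sign language recognition. RKD (random knowledge distillation) is then proposed to extract knowledge from multiple teacher models and improve the feature extraction ability of the lightweight 3D convolutions. For continuous sign language, a Transformer based entirely on self-attention performs global modeling after feature extraction. Experimental results show that the proposed method achieves a recognition rate of 95.10% and a WER (word error rate) of 1.9 on the CSL-500 and CSL-continuous datasets, which demonstrates its effectiveness.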
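The WER metric reported above is conventionally computed as the word-level edit distance between the recognized gloss sequence and the reference, divided by the reference length. The sketch below illustrates that standard definition only; it is not taken from the paper, and the function name `word_error_rate` and the example glosses are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: this follows the common WER definition
    (substitutions + deletions + insertions) / len(reference);
    the paper's exact evaluation script is not shown in the abstract.
    """
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference must not be empty")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(m + 1))
    for i in range(1, n + 1):
        curr = [i] + [0] * m
        for j in range(1, m + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            curr[j] = min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[m] / n


# Hypothetical gloss sequences: one deletion ("want") and one insertion ("now").
ref = ["I", "want", "drink", "water"]
hyp = ["I", "drink", "water", "now"]
print(word_error_rate(ref, hyp))  # 0.5
```

WER is often quoted as a percentage, in which case the returned fraction would be multiplied by 100.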
Key words
sign language recognition, lightweight 3D convolutional neural networks, knowledge distillation, Transformer network, feature extraction