Triplet Loss-based Convolutional Neural Network for Static Sign Language Recognition

2022 Innovations in Intelligent Systems and Applications Conference (ASYU), 2022

Abstract
Sign language (SL) is a non-verbal visual language used as a primary communication tool by the deaf and hearing-impaired community. Because there are many SLs with wide variations, mastering their interpretation would demand an infeasible effort from the general public. Despite recent advances in automatic sign language recognition (SLR) systems, their performance degrades severely when low-resolution images with large intra-class and slight inter-class variations are used. To address these issues, a novel end-to-end Convolutional Neural Network (CNN) is proposed to extract features from low-resolution input images. This feature extractor is trained with a semi-hard triplet loss function, so that images of the same class are placed close together in a lower-dimensional embedding space while the distance between samples from different classes is maximized. In addition to this loss function, careful selection of the filter and kernel sizes, activation functions, and regularization methods in the proposed CNN yields effective feature vectors from small images while reducing the number of parameters. The embedded features, with a fixed small vector length, are then used to train a Support Vector Machine (SVM) classifier for final recognition. Experimental results on datasets from two SLs, American (MNIST) and Arabic (ArSL2018), with accuracies of 100% and 97.54%, respectively, demonstrate that the proposed model outperforms existing approaches without enlarging the dataset through augmentation, which proves its feasibility.
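The semi-hard triplet objective summarized above can be sketched as follows. This is an illustrative NumPy-only implementation, not the authors' code: the within-batch mining strategy, the choice of squared Euclidean distances, and the margin value are all assumptions. A negative is "semi-hard" when it lies farther from the anchor than the positive, but still inside the margin.

```python
import numpy as np

def pairwise_sq_dists(emb):
    """Squared Euclidean distance matrix between all embedding rows."""
    sq = np.sum(emb ** 2, axis=1)
    d = sq[:, None] - 2.0 * emb @ emb.T + sq[None, :]
    return np.maximum(d, 0.0)  # clip tiny negatives from rounding

def semi_hard_triplet_loss(emb, labels, margin=1.0):
    """Mean loss over (anchor, positive, semi-hard negative) triplets.

    For an anchor a and positive p, a negative n is semi-hard when
        d(a, p) < d(a, n) < d(a, p) + margin,
    and the per-triplet loss is  d(a, p) - d(a, n) + margin  (always > 0
    on a semi-hard negative). Triplets with no semi-hard negative are
    skipped; the hardest (closest) semi-hard negative is used.
    """
    d = pairwise_sq_dists(emb)
    n = len(labels)
    losses = []
    for a in range(n):
        for p in range(n):
            if p == a or labels[p] != labels[a]:
                continue
            semi_hard = [d[a, k] for k in range(n)
                         if labels[k] != labels[a]
                         and d[a, p] < d[a, k] < d[a, p] + margin]
            if semi_hard:
                losses.append(d[a, p] - min(semi_hard) + margin)
    return float(np.mean(losses)) if losses else 0.0
```

In the full pipeline described in the abstract, a loss of this form would be minimized over CNN embeddings of the hand-sign images; once classes are well separated (the loss approaches zero), the fixed-length embeddings are handed to an SVM for the final classification.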
Keywords
static sign language recognition, semi-hard triplet loss, CNN, SVM, feature embedding