SignNet II: A Transformer-Based Two-Way Sign Language Translation Model

IEEE Transactions on Pattern Analysis and Machine Intelligence (2023)

Abstract
The role of a sign interpreting agent is to bridge the communication gap between the hearing and the Deaf or Hard of Hearing communities by translating both from sign language to text and from text to sign language. Until now, much of the AI work in automated sign language processing has focused primarily on sign-language-to-text translation, which places the advantage mainly on the side of hearing individuals. In this article, we describe advances in sign language processing based on transformer networks. Specifically, we introduce SignNet II, a sign language processing architecture and a promising step towards facilitating two-way sign language communication. It comprises sign-to-text and text-to-sign networks jointly trained using a dual learning mechanism. Furthermore, by exploiting the notion of sign similarity, a metric embedding learning process is introduced to enhance text-to-sign translation performance. Using a bank of multi-feature transformers, we analyzed several input feature representations and found that keypoint-based pose features consistently performed well, irrespective of the quality of the input videos. We demonstrated that the two jointly trained networks outperformed their individually trained counterparts, showing noteworthy improvements in BLEU-1 through BLEU-4 scores when tested on the largest available German Sign Language (GSL) benchmark dataset.
Keywords
Assistive technologies, Gesture recognition, Transformers, Decoding, Three-dimensional displays, Feature extraction, Videos, Sign language translations, dual learning, transformer model, metric embedded learning
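The abstract describes two transformer networks, sign-to-text and text-to-sign, trained jointly under a dual learning mechanism, with a metric embedding objective that pulls similar signs together. The sketch below illustrates one way such a joint training step could look in PyTorch. It is only a minimal illustration under assumed choices: the pose dimension, vocabulary size, layer counts, triplet margin, and the use of a jittered copy as the "similar sign" are all hypothetical, and attention masks, padding, and real data loading are omitted. It is not the authors' implementation.

```python
# Minimal sketch: jointly training sign-to-text and text-to-sign transformers
# with a triplet (metric-embedding) loss on pooled sign embeddings.
# Shapes and hyper-parameters are assumptions, not the paper's settings.
import torch
import torch.nn as nn

D_POSE, D_MODEL, VOCAB = 274, 256, 3000  # e.g. 137 2-D keypoints; token vocab size


class SignToText(nn.Module):
    def __init__(self):
        super().__init__()
        self.pose_proj = nn.Linear(D_POSE, D_MODEL)   # keypoint frames -> model space
        self.tok_emb = nn.Embedding(VOCAB, D_MODEL)
        self.tr = nn.Transformer(D_MODEL, nhead=8, num_encoder_layers=3,
                                 num_decoder_layers=3, batch_first=True)
        self.out = nn.Linear(D_MODEL, VOCAB)

    def forward(self, poses, tokens):
        src = self.pose_proj(poses)                   # (B, T_pose, D_MODEL)
        tgt = self.tok_emb(tokens)                    # (B, T_txt, D_MODEL)
        return self.out(self.tr(src, tgt))            # (B, T_txt, VOCAB)


class TextToSign(nn.Module):
    def __init__(self):
        super().__init__()
        self.tok_emb = nn.Embedding(VOCAB, D_MODEL)
        self.pose_proj = nn.Linear(D_POSE, D_MODEL)
        self.tr = nn.Transformer(D_MODEL, nhead=8, num_encoder_layers=3,
                                 num_decoder_layers=3, batch_first=True)
        self.out = nn.Linear(D_MODEL, D_POSE)         # regress keypoint frames

    def forward(self, tokens, poses_in):
        src = self.tok_emb(tokens)
        tgt = self.pose_proj(poses_in)
        return self.out(self.tr(src, tgt))            # (B, T_pose, D_POSE)


s2t, t2s = SignToText(), TextToSign()
opt = torch.optim.Adam(list(s2t.parameters()) + list(t2s.parameters()), lr=1e-4)
ce, mse, triplet = nn.CrossEntropyLoss(), nn.MSELoss(), nn.TripletMarginLoss(margin=0.2)

# One joint training step on a toy batch: both translation directions contribute
# to the same gradient update, and similar signs are pulled together in embedding space.
poses = torch.randn(2, 50, D_POSE)                    # pose-keypoint sequences
tokens = torch.randint(0, VOCAB, (2, 12))             # text/gloss token ids

opt.zero_grad()
logits = s2t(poses, tokens[:, :-1])                   # teacher-forced text decoding
loss_s2t = ce(logits.reshape(-1, VOCAB), tokens[:, 1:].reshape(-1))
loss_t2s = mse(t2s(tokens, poses[:, :-1]), poses[:, 1:])

emb = s2t.pose_proj(poses).mean(dim=1)                # pooled sign embeddings
anchor = emb[0:1]
positive = emb[0:1] + 0.01 * torch.randn(1, D_MODEL)  # placeholder for a similar sign
negative = emb[1:2]                                   # a dissimilar sign
loss_metric = triplet(anchor, positive, negative)

(loss_s2t + loss_t2s + loss_metric).backward()
opt.step()
```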