Deep Learning-Based Sign Language Recognition Using Efficient Multi-Feature Attention Mechanism
Abstract
Sign language is a communication system used by Deaf and hard-of-hearing people and serves as a bridge between Deaf and hearing communities. Because sign language relies on numerous visuomotor elements spanning both visual perception (hand shapes, facial expressions) and physical movement (hand and arm motions), it constitutes a multimodal input source for Sign Language Recognition (SLR) systems. In this study, a novel deep learning-based architecture combining EfficientNet with a multi-feature attention mechanism is proposed to accurately recognize sign language signs. First, general visual features are obtained from the EfficientNet model via transfer learning. Next, dataset-specific contextual features are extracted by two distinct network types: spatial dependencies are modeled with Convolutional Neural Networks (CNNs), while temporal dynamics are learned with Recurrent Neural Networks (RNNs). An attention mechanism then adaptively weights these features, focusing on the information most critical to the classification task. This ensures that the most informative components of both branches are emphasized, yielding a significant improvement in final performance. Using RGB video, the proposed model achieved accuracies of 99.01% and 96.84% on the 50-class and 174-class subsets of the BosphorusSign22k General dataset of Turkish Sign Language (TSL) signs, respectively. The model's generalization ability was further demonstrated by accuracies of 99.84% on the Argentinian Sign Language dataset (LSA64) and 98.41% on the Indian Sign Language dataset (INCLUDE50). Experimental results indicate that the proposed architecture performs competitively against existing SLR models reported in the literature.
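The abstract describes adaptively weighting CNN (spatial) and RNN (temporal) feature branches with an attention mechanism. The following is a minimal illustrative sketch of that fusion idea only, not the authors' implementation: the feature values, the scoring vector, and the function names are all hypothetical, and in the actual model the attention parameters would be learned during training.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scalar scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(features, score_w):
    """Attention-weighted fusion of per-branch feature vectors.

    features: list of equal-length vectors, one per branch
              (e.g. CNN spatial features, RNN temporal features).
    score_w:  scoring vector standing in for learned attention parameters.
    Returns the fused vector and the attention weights.
    """
    # Score each branch via a dot product with the scoring vector.
    scores = [sum(w * f for w, f in zip(score_w, feat)) for feat in features]
    weights = softmax(scores)
    # Element-wise weighted sum emphasises the most informative branch.
    dim = len(features[0])
    fused = [sum(weights[b] * features[b][i] for b in range(len(features)))
             for i in range(dim)]
    return fused, weights

# Toy example: 4-dimensional spatial (CNN) and temporal (RNN) features.
spatial = [0.9, 0.1, 0.4, 0.3]
temporal = [0.2, 0.8, 0.5, 0.6]
fused, weights = attention_fuse([spatial, temporal], score_w=[1.0, 1.0, 1.0, 1.0])
```

The softmax guarantees the branch weights are positive and sum to one, so the fused vector is a convex combination of the branch features; the branch with the higher attention score dominates the result.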