Deep Learning-Based Sign Language Recognition Using Efficient Multi-Feature Attention Mechanism

dc.authorid: 0000-0001-6757-5990
dc.contributor.author: Yenisari, Esma
dc.contributor.author: Yavuz, Sirma
dc.date.accessioned: 2026-02-03T12:00:44Z
dc.date.available: 2026-02-03T12:00:44Z
dc.date.issued: 2025
dc.department: Çanakkale Onsekiz Mart Üniversitesi
dc.description.abstract: Sign language is a communication system used by Deaf and hard-of-hearing people and serves as a bridge between Deaf and hearing communities. Since sign language uses numerous visuomotor elements that include both visual perception (hand shapes, facial expressions) and physical movements (hand and arm movements), it represents a multimodal input source for Sign Language Recognition (SLR) systems. In this study, a novel deep learning-based architecture using EfficientNet and a multi-feature attention mechanism is proposed to accurately recognize SL signs. Initially, general visual features are acquired through the EfficientNet model, leveraging the transfer learning paradigm. Subsequently, dataset-specific contextual features are extracted using distinct network types: spatial dependencies are modeled via Convolutional Neural Networks (CNNs), whereas temporal dynamics are learned through Recurrent Neural Networks (RNNs). These features are adaptively weighted by an attention mechanism that focuses on the information most critical to the classification task. This approach ensures that the most information-rich and useful components of both methods are emphasized, leading to a significant increase in final performance. Using RGB video, the proposed model achieved accuracies of 99.01% and 96.84% for 50 and 174 sign classes, respectively, on the BosphorusSign22k General dataset of Turkish Sign Language (TSL) signs. Furthermore, the generalization ability of the model was demonstrated by its high accuracy of 99.84% on the Argentinian Sign Language dataset (LSA64) and 98.41% on the Indian Sign Language dataset (INCLUDE50). Experimental results indicated that the proposed architecture is competitive with existing SLR models reviewed in the literature.
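The abstract describes adaptively weighting spatial (CNN) and temporal (RNN) feature streams with an attention mechanism. The paper's exact scoring function is not reproduced in this record, so the sketch below is only an illustrative stand-in: it scores each stream with a hypothetical learned projection vector (`w_s`, `w_t`), normalizes the scores with a softmax, and forms the attention-weighted combination.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_fuse(spatial_feat, temporal_feat, w_s, w_t):
    """Fuse two feature streams via scalar attention weights.

    Illustrative sketch only: the scoring vectors w_s and w_t are
    hypothetical stand-ins for whatever learned scoring function the
    paper's attention module actually uses.
    """
    scores = np.array([spatial_feat @ w_s, temporal_feat @ w_t])
    alphas = softmax(scores)  # attention weights, sum to 1
    fused = alphas[0] * spatial_feat + alphas[1] * temporal_feat
    return fused, alphas

# Toy usage with random 8-dimensional features.
rng = np.random.default_rng(0)
d = 8
f_s, f_t = rng.standard_normal(d), rng.standard_normal(d)
w_s, w_t = rng.standard_normal(d), rng.standard_normal(d)
fused, alphas = attention_fuse(f_s, f_t, w_s, w_t)
```

In a trained model the scoring vectors would be learned end-to-end, so streams carrying more class-discriminative information receive larger weights.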
dc.description.sponsorship: Scientific and Technological Research Council of Turkiye (TUBITAK) [125E318]
dc.description.sponsorship: This work was supported by the Scientific and Technological Research Council of Turkiye (TUBITAK) under Project 125E318.
dc.identifier.doi: 10.1109/ACCESS.2025.3586096
dc.identifier.endpage: 126699
dc.identifier.issn: 2169-3536
dc.identifier.scopus: 2-s2.0-105009969779
dc.identifier.scopusquality: Q1
dc.identifier.startpage: 126684
dc.identifier.uri: https://doi.org/10.1109/ACCESS.2025.3586096
dc.identifier.uri: https://hdl.handle.net/20.500.12428/34691
dc.identifier.volume: 13
dc.identifier.wos: WOS:001534536400047
dc.identifier.wosquality: Q2
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: IEEE-Inst Electrical Electronics Engineers Inc
dc.relation.ispartof: IEEE Access
dc.relation.publicationcategory: Article - International Peer-Reviewed Journal - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/openAccess
dc.snmz: KA_WOS_20260130
dc.subject: Sign language
dc.subject: Hands
dc.subject: Systematic literature review
dc.subject: Feature extraction
dc.subject: Attention mechanisms
dc.subject: Sensors
dc.subject: Deep learning
dc.subject: Cameras
dc.subject: Accuracy
dc.subject: Deafness
dc.subject: Attention mechanism
dc.subject: computer vision
dc.subject: deep learning
dc.subject: sign language recognition
dc.subject: SLR datasets
dc.subject: vision-based recognition
dc.title: Deep Learning-Based Sign Language Recognition Using Efficient Multi-Feature Attention Mechanism
dc.type: Article

Files