Sensor-based Bangla Sign Language Recognition: Comparative Analysis of Data Augmentation Methods
Authors
Md. Rakibul Islam
(Computer Science and Engineering)
Abstract
Bangla Sign Language (BdSL) plays a vital role in communication for people with speech and hearing disabilities in Bangladesh. Prior systems, both vision-based and sensor-based, suffer from sensor noise, limited datasets, and high computational requirements. This study proposes a new dynamic dataset containing 11,264 samples across 22 distinct classes, with 440 temporal features per sample and a fixed window for capturing dynamic hand gestures. The dataset was collected using flex sensors, gyroscopes, and accelerometers integrated into a flexible hand glove. To overcome dataset size limitations and sensor noise, this study also investigates how data augmentation can improve sensor-based BdSL recognition, comparatively analyzing the effectiveness of seven augmentation schemes, ranging from classic oversampling methods (SMOTE, ADASYN, BorderlineSMOTE) to an advanced generative method (WGAN-GP), as well as noise-based and mix-up schemes. Among them, ADASYN augmentation with ensemble voting yields the highest accuracy of 92.59%, while the more complex WGAN-GP yields 91.97%. For recognition, a systematic comparison was conducted among six traditional machine learning (ML) models, five deep learning (DL) models, and an ensemble method on the ADASYN-augmented dataset. A new Feature Fusion Network (FFN) architecture for BdSL achieves an accuracy of 91.39%. The experiments also demonstrate the feasibility of real-time mobile deployment: k-NN achieves 88.73% accuracy with 2.3 ms inference, and a simple LSTM achieves 92.10% accuracy with 8.4 ms inference, making k-NN the most practical lightweight model for BdSL users.
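As a minimal illustration of two of the augmentation schemes named above, the sketch below implements noise-based jittering and mix-up on flattened sensor windows. The 440-feature vector length comes from the dataset description; all function names, the noise level `sigma`, and the mix-up `alpha` are hypothetical choices for illustration, not the paper's exact settings.

```python
import numpy as np

def jitter(x, sigma=0.05, rng=None):
    """Noise-based augmentation: add zero-mean Gaussian noise to a sensor window."""
    rng = np.random.default_rng(rng)
    return x + rng.normal(0.0, sigma, size=x.shape)

def mixup(x1, x2, alpha=0.2, rng=None):
    """Mix-up: return a convex combination of two sensor windows and the mixing weight."""
    rng = np.random.default_rng(rng)
    lam = rng.beta(alpha, alpha)  # mixing coefficient drawn from Beta(alpha, alpha)
    return lam * x1 + (1.0 - lam) * x2, lam

# Hypothetical samples with 440 temporal features each, as in the proposed dataset.
x_a = np.zeros(440)
x_b = np.ones(440)

x_jit = jitter(x_a, sigma=0.05, rng=0)      # noisy copy of x_a
x_mix, lam = mixup(x_a, x_b, rng=0)         # blended sample with weight lam
```

In practice such synthetic windows are appended to the training split only, so that accuracy is still measured on unaugmented test data; label handling for mix-up (blending one-hot labels by the same `lam`) is omitted here for brevity.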