dc.contributor.author | M. Aayan Khattak, 01-132202-001 | |
dc.contributor.author | Ahmed Raza Kalair, 01-132202-046 | |
dc.contributor.author | Talha Iqbal Butt, 01-132202-052 | |
dc.date.accessioned | 2024-10-24T08:14:19Z | |
dc.date.available | 2024-10-24T08:14:19Z | |
dc.date.issued | 2024 | |
dc.identifier.uri | http://hdl.handle.net/123456789/18213 | |
dc.description | Supervised by Dr. Shehzad Khalid | en_US |
dc.description.abstract | This thesis examines a sign language translator, one of the most useful applications of computer vision and action recognition. Previous sign language translators often do not offer translation of action gestures, lack stability, and translate only letters or words, resulting in slow communication. This application aims to equip people with special needs in our society with a powerful tool so that they do not feel they are missing out on anything. In our research, we found that many sign language interpreters achieve good accuracy, but most are word-based or letter-based. Therefore, the need arose to create a sign language interpreter that is not limited to hand gestures and positioning but also takes a dynamic approach that includes body movement. Our two sign language interpreters, static and dynamic, are based on neural networks, a CNN and an LSTM respectively. In the static interpreter, we use a CNN to translate different signs into words and then send the words to the GPT API, which composes meaningful sentences from a prompt. The dynamic interpreter uses the MediaPipe Holistic solution to extract body landmarks and then detects the actions performed by the user through temporal modeling with long short-term memory cells. Results showed that our trained model achieved 94% accuracy on the static dataset and 90.8% accuracy on the dynamic dataset. The output of our pre-trained model for the static dataset was then fed into the GPT transformer through an API call, which produced meaningful sentences. Finally, we concluded that, by using neural networks, we will be able to train our model on more words over time and generate more meaningful sentences, as well as train more actions on the dynamic temporal model. Additionally, Sign2Text will be made readily available online, anywhere and anytime, for fluent communication. | en_US |
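As a rough illustration of the dynamic pipeline described in the abstract, the sketch below shows how MediaPipe Holistic landmarks can be flattened into a fixed-length keypoint sequence and classified with a stacked LSTM. The 30-frame window, layer sizes, and gesture labels (GESTURES) are illustrative assumptions, not the configuration used in the thesis; the static CNN branch and the GPT API call are omitted.

# Hypothetical sketch of the dynamic branch: MediaPipe Holistic landmarks ->
# fixed-length keypoint sequence -> stacked LSTM classifier.
import numpy as np
import cv2
import mediapipe as mp
import tensorflow as tf

GESTURES = ["hello", "thanks", "help"]   # placeholder action labels (assumed)
SEQ_LEN = 30                             # frames per gesture clip (assumed)

mp_holistic = mp.solutions.holistic

def extract_keypoints(results):
    """Flatten pose and both hand landmarks into one feature vector per frame."""
    pose = (np.array([[lm.x, lm.y, lm.z, lm.visibility]
                      for lm in results.pose_landmarks.landmark]).flatten()
            if results.pose_landmarks else np.zeros(33 * 4))
    lh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.left_hand_landmarks.landmark]).flatten()
          if results.left_hand_landmarks else np.zeros(21 * 3))
    rh = (np.array([[lm.x, lm.y, lm.z]
                    for lm in results.right_hand_landmarks.landmark]).flatten()
          if results.right_hand_landmarks else np.zeros(21 * 3))
    return np.concatenate([pose, lh, rh])  # 258 features per frame

def record_sequence(camera_index=0):
    """Capture SEQ_LEN webcam frames and convert each to a keypoint vector."""
    frames = []
    cap = cv2.VideoCapture(camera_index)
    with mp_holistic.Holistic(min_detection_confidence=0.5,
                              min_tracking_confidence=0.5) as holistic:
        while len(frames) < SEQ_LEN and cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            results = holistic.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            frames.append(extract_keypoints(results))
    cap.release()
    return np.array(frames)  # shape: (SEQ_LEN, 258)

def build_lstm_classifier(n_features=258, n_classes=len(GESTURES)):
    """Stacked LSTM over the keypoint sequence with a softmax over gesture classes."""
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=(SEQ_LEN, n_features)),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.LSTM(128),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(n_classes, activation="softmax"),
    ])

if __name__ == "__main__":
    model = build_lstm_classifier()
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
    sequence = record_sequence()
    if sequence.shape[0] == SEQ_LEN:
        probs = model.predict(sequence[np.newaxis, ...])[0]
        print("Predicted gesture:", GESTURES[int(np.argmax(probs))])

In a full system of this kind, the predicted gesture words would then be collected and passed to a language model (such as the GPT API mentioned in the abstract) to be composed into complete sentences.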
dc.language.iso | en | en_US |
dc.publisher | Computer Engineering, Bahria University Engineering School Islamabad | en_US |
dc.relation.ispartofseries | BCE;P-2817 | |
dc.subject | Computer Engineering | en_US |
dc.subject | Challenges and Limitations | en_US |
dc.subject | Confusion Matrix of the Dual Modes | en_US |
dc.title | Sign2text: Real-time Sign Language Translator Using Deep Learning | en_US |
dc.type | Project Reports | en_US |