Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18096
Full metadata record
DC Field | Value | Language
dc.contributor.author | SINGH, MANISHA | -
dc.date.accessioned | 2020-12-28T06:24:10Z | -
dc.date.available | 2020-12-28T06:24:10Z | -
dc.date.issued | 2020-06 | -
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/18096 | -
dc.description.abstract | Sign language enables people who cannot speak to communicate. People with this disability use several modes to interact with others; one of the most common is sign language. The challenge is that people who do not know sign language cannot understand it, which makes it hard for mute people to communicate with them. Developing a sign language application for deaf people is therefore important, as it lets them communicate easily even with those who do not understand sign language. A sign language recognizer is a basic step toward bridging this communication gap between hearing people and deaf or mute people who use sign language. The focus of this work is to build a vision-based system that identifies sign language gestures from video sequences. A vision-based system was chosen because it provides a simpler and more intuitive way for a human to communicate with a computer. In this report, 46 different gestures are considered. Video sequences contain both temporal and spatial features, so two different models are trained, one for each kind of feature. The spatial features of the video sequences are learned with the Inception model [14], a deep convolutional neural network (CNN), trained on frames extracted from the training videos. A recurrent neural network (RNN) is trained on the temporal features. The trained CNN is used to produce, for each video, a sequence of per-frame predictions or pool-layer outputs, and this sequence is then given to the RNN to learn the temporal structure. The data set [7] consists of Argentinian Sign Language (LSA) gestures, with around 2300 videos belonging to 46 gesture categories. Using the CNN predictions as input to the RNN gives 93.3% accuracy, and using the pool-layer outputs as input gives 95.217% accuracy. A minimal sketch of this CNN-to-RNN pipeline is given after the metadata record below. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | TD-4957; | -
dc.subject | SIGN LANGUAGE | en_US
dc.subject | DEEP LEARNING | en_US
dc.subject | CNN MODEL | en_US
dc.title | SIGN LANGUAGE DETECTION USING DEEP LEARNING | en_US
dc.type | Thesis | en_US
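
The abstract above describes a two-stage pipeline: a pretrained Inception CNN extracts per-frame spatial features (either class predictions or pool-layer outputs), and an RNN classifies the resulting feature sequence into one of the 46 gesture categories. The following is a minimal sketch of that idea, assuming TensorFlow/Keras; the frame count, LSTM size, and other hyperparameters are illustrative assumptions, not values taken from the thesis.

```python
# Sketch of the CNN -> RNN pipeline described in the abstract.
# Hyperparameters and helper names are illustrative assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras import layers, models

NUM_CLASSES = 46       # gesture categories in the LSA dataset [7]
FRAMES_PER_VIDEO = 40  # assumed fixed-length frame sampling per video
FEATURE_DIM = 2048     # size of InceptionV3's global average-pool output

# Spatial model: pretrained InceptionV3 truncated at the global average-pool layer.
cnn = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def video_to_feature_sequence(frames: np.ndarray) -> np.ndarray:
    """Map a (FRAMES_PER_VIDEO, 299, 299, 3) array of RGB frames to a
    (FRAMES_PER_VIDEO, FEATURE_DIM) sequence of pool-layer features."""
    x = preprocess_input(frames.astype("float32"))
    return cnn.predict(x, verbose=0)

# Temporal model: an LSTM over the per-frame feature sequence.
rnn = models.Sequential([
    layers.Input(shape=(FRAMES_PER_VIDEO, FEATURE_DIM)),
    layers.LSTM(256),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
rnn.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

# Training would use sequences built with video_to_feature_sequence():
#   X_train: (num_videos, FRAMES_PER_VIDEO, FEATURE_DIM), y_train: (num_videos,)
# rnn.fit(X_train, y_train, epochs=20, batch_size=16, validation_split=0.1)
```

To use per-frame class predictions instead of pool-layer features (the other variant compared in the abstract), the Inception model would be kept with its classification head and FEATURE_DIM would shrink to the number of CNN output classes; the RNN stage stays the same.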
Appears in Collections: M.E./M.Tech. Computer Engineering

Files in This Item:
File | Description | Size | Format
M.TECH. MANISHA SINGH.pdf |  | 1.66 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.