Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18096
Full metadata record
DC Field | Value | Language
dc.contributor.author | SINGH, MANISHA | -
dc.date.accessioned | 2020-12-28T06:24:10Z | -
dc.date.available | 2020-12-28T06:24:10Z | -
dc.date.issued | 2020-06 | -
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/18096 | -
dc.description.abstract | Sign language enables people who cannot speak to communicate. People with this disability use several modes to interact with others; one of the most common is sign language. The challenge is that people who do not know sign language cannot understand it, which makes it hard for mute people to communicate with them. Developing a sign language application for deaf people is therefore important, as it lets them communicate easily even with those who do not understand sign language. A sign language recognizer is a basic step toward bridging this communication gap between hearing people and deaf or mute people who use sign language. The focus of this work is to build a vision-based system that identifies sign language gestures from video sequences. A vision-based system was chosen because it provides a simpler and more intuitive way for a human to communicate with a computer. In this report, 46 different gestures are considered. Video sequences contain both temporal and spatial features, so two different models are trained, one for each kind of feature. The spatial features of the video sequences are learned with the Inception model [14], a deep convolutional neural network (CNN), trained on frames extracted from the training videos. A recurrent neural network (RNN) is trained on the temporal features. The trained CNN is used to produce, for each video, a sequence of per-frame predictions or pool-layer outputs, and this sequence is then given to the RNN to learn the temporal structure. The data set [7] consists of Argentinian Sign Language (LSA) gestures, with around 2300 videos belonging to 46 gesture categories. Using the CNN predictions as input to the RNN gives 93.3% accuracy, and using the pool-layer outputs as input gives 95.217% accuracy. A minimal sketch of this CNN-to-RNN pipeline is given after the metadata record below. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | TD-4957; | -
dc.subject | SIGN LANGUAGE | en_US
dc.subject | DEEP LEARNING | en_US
dc.subject | CNN MODEL | en_US
dc.title | SIGN LANGUAGE DETECTION USING DEEP LEARNING | en_US
dc.type | Thesis | en_US
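
The abstract above describes a two-stage pipeline: a pretrained Inception CNN extracts per-frame spatial features (either class predictions or pool-layer outputs), and an RNN classifies the resulting feature sequence into one of the 46 gesture categories. The following is a minimal sketch of that idea, assuming TensorFlow/Keras; the frame count, LSTM size, and other hyperparameters are illustrative assumptions, not values taken from the thesis.

```python
# Sketch of the CNN -> RNN pipeline described in the abstract.
# Hyperparameters and helper names are illustrative assumptions.
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications.inception_v3 import InceptionV3, preprocess_input
from tensorflow.keras import layers, models

NUM_CLASSES = 46       # gesture categories in the LSA dataset [7]
FRAMES_PER_VIDEO = 40  # assumed fixed-length frame sampling per video
FEATURE_DIM = 2048     # size of InceptionV3's global average-pool output

# Spatial model: pretrained InceptionV3 truncated at the global average-pool layer.
cnn = InceptionV3(weights="imagenet", include_top=False, pooling="avg")

def video_to_feature_sequence(frames: np.ndarray) -> np.ndarray:
    """Map a (FRAMES_PER_VIDEO, 299, 299, 3) array of RGB frames to a
    (FRAMES_PER_VIDEO, FEATURE_DIM) sequence of pool-layer features."""
    x = preprocess_input(frames.astype("float32"))
    return cnn.predict(x, verbose=0)

# Temporal model: an LSTM over the per-frame feature sequence.
rnn = models.Sequential([
    layers.Input(shape=(FRAMES_PER_VIDEO, FEATURE_DIM)),
    layers.LSTM(256),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
rnn.compile(optimizer="adam",
            loss="sparse_categorical_crossentropy",
            metrics=["accuracy"])

# Training would use sequences built with video_to_feature_sequence():
#   X_train: (num_videos, FRAMES_PER_VIDEO, FEATURE_DIM), y_train: (num_videos,)
# rnn.fit(X_train, y_train, epochs=20, batch_size=16, validation_split=0.1)
```

To use per-frame class predictions instead of pool-layer features (the other variant compared in the abstract), the Inception model would be kept with its classification head and FEATURE_DIM would shrink to the number of CNN output classes; the RNN stage stays the same.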
Appears in Collections: M.E./M.Tech. Computer Engineering

Files in This Item:
File | Description | Size | Format
M.TECH. MANISHA SINGH.pdf |  | 1.66 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.