Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21249
Title: DYNAMIC HAND GESTURE RECOGNITION FRAMEWORK FOR HUMAN-COMPUTER INTERACTION
Authors: TRIPATHI, REENA
Keywords: DYNAMIC HAND GESTURE RECOGNITION
HUMAN-COMPUTER INTERACTION
CLIP MODEL
DDA LOSS
HCI
Issue Date: Nov-2024
Series/Report no.: TD-7635;
Abstract: Hand gestures are an important means of human communication, and their use in technology draws on this natural mode of interaction. Early human-computer interaction relied on hardware devices such as the keyboard, mouse, and pen; gesture-based interaction can replace these devices and reduce hardware cost. With the advancement of technology in the digital era, the need for human-computer interaction (HCI) techniques continues to grow, and hand gesture recognition is one way of enabling human interaction with computers. Hand gestures have numerous applications in daily life, ranging from controlling autonomous vehicles to smart home development and human-robot interaction. They are used in clinical settings, where surgeons can manipulate MRI or X-ray scans through hand gestures. In sign language recognition, hand gestures enable communication within the deaf community. In robotics, dynamic hand gestures control robot movements, and 3D hand gesture recognition facilitates real-time human-computer interaction. Hand gestures also play a crucial role in home automation, controlling appliances such as lights, fans, and security systems. On computers and tablets, gestures are used to drag, drop, and move files, improving human-computer interaction.

Detecting and recognizing the gesturing hand falls under the area of hand gesture analysis. Locating the gesturing hand is more difficult than locating other parts of the human body because of the hand's smaller size. The hand also presents greater complexity and more challenges due to factors such as hand occlusion, background clutter, lighting variation, and inter- and intra-class variation. These factors affect the accuracy of dynamic hand gesture recognition. Real-time recognition of dynamic hand gestures is difficult because an algorithm cannot accurately determine where a gesture starts and ends in a video feed.

Dynamic hand gesture recognition (DHGR), which involves understanding gestures in motion over time, poses several challenges. These include variations in lighting, occlusions, complex backgrounds, and similarities between gestures both within the same category (intra-class) and across different categories (inter-class), which make detection and recognition difficult. Traditional models often struggle to address these issues, particularly when trained on a small or limited dataset. Furthermore, integrating dual-modality and multi-modality inputs, where RGB data, skeletal data, and depth information are combined in a single model, adds to the challenge.

The aforementioned challenges motivated us to work in the field of dynamic hand gesture recognition, addressing research gaps such as hand detection and tracking, inter- and intra-class variation, hand occlusion, and the design of an efficient and generic framework. This thesis aims to develop efficient models that handle these issues, work well with limited data, and perform reliably under diverse conditions. In the first framework, we address the challenge of hand detection and tracking, using the CLIP model to extract features from RGB videos. The CLIP-BLSTM model is specifically designed to address the challenges of small hand size and changing lighting conditions, and it proves efficient with fewer training samples and parameters.
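As an illustration of this kind of pipeline (this is not the thesis code), the minimal sketch below pairs a frozen CLIP image encoder with a bidirectional LSTM over per-frame features; the CLIP variant ("ViT-B/32"), hidden size, and class count are placeholder assumptions.

```python
# Hedged sketch: per-frame CLIP features -> Bi-LSTM gesture classifier.
# Assumes PyTorch and OpenAI's CLIP package (pip install git+https://github.com/openai/CLIP.git).
import torch
import torch.nn as nn
import clip


class ClipBiLSTM(nn.Module):
    def __init__(self, num_classes: int, hidden_size: int = 256, device: str = "cpu"):
        super().__init__()
        # Frozen CLIP image encoder yields one feature vector per video frame.
        self.clip_model, _ = clip.load("ViT-B/32", device=device)
        for p in self.clip_model.parameters():
            p.requires_grad = False
        feat_dim = self.clip_model.visual.output_dim  # 512 for ViT-B/32
        # A bidirectional LSTM models the temporal dynamics of the gesture.
        self.blstm = nn.LSTM(feat_dim, hidden_size, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden_size, num_classes)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, 3, 224, 224), already CLIP-preprocessed.
        b, t = frames.shape[:2]
        with torch.no_grad():
            feats = self.clip_model.encode_image(frames.flatten(0, 1)).float()
        feats = feats.view(b, t, -1)           # (batch, time, feat_dim)
        out, _ = self.blstm(feats)             # (batch, time, 2 * hidden_size)
        return self.classifier(out[:, -1])     # logits from the final time step


# Usage: classify a batch of 2 clips of 16 frames into 14 gesture classes.
model = ClipBiLSTM(num_classes=14)
logits = model(torch.randn(2, 16, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 14])
```

Freezing the CLIP encoder keeps the number of trainable parameters small, which is consistent with the abstract's claim of efficiency with fewer training samples and parameters.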
Overall, it performs effectively in different lighting environments, establishing it as an accurate hand gesture recognition system. Next, extracting skeleton data from the RGB data and using it in the proposed models overcomes the challenges of background clutter and gesturing-hand tracking. The same gesture may be performed differently by different people, which raises the problem of inter-class and intra-class variation. To tackle this, DDA Loss is employed to increase within-class (intra-class) similarity between gestures and reduce between-class (inter-class) similarity. In this work, skeleton data is used to create skeleton point trajectories, and DDA Loss enhances feature learning so that intra-class similarity increases and inter-class similarity decreases.

Our analysis of the literature shows that deep learning models using multiple modalities perform better than those using a single modality. We therefore also work on dual-modality and multi-modality approaches. In the next model, we combine skeletal data with RGB data to recognize dynamic hand gestures. The proposed model offers a bidirectional gated recurrent unit (Bi-GRU)-based hand gesture recognition system that is computationally more efficient than a Bi-LSTM. This method is designed to attain high-speed performance while remaining capable of working successfully even with limited training samples. The dual-feature extraction method allows the model to achieve a more robust understanding of hand gestures, improving overall performance in diverse environments. However, limited research has been conducted on multi-modal fusion, even though combining multiple modalities can boost performance. In the final work, we develop a hybrid framework that integrates RGB, depth, and skeleton data to create an efficient system for dynamic hand gesture recognition. An extensive experimental study conducted on standard datasets such as SKIG, DHG14/28, NWUHG, FPHA, LISA, 26-Gestures, NTU, NTU120, and CHG illustrates the effectiveness of all the proposed frameworks.
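To make the discriminative objective described above concrete, here is a minimal sketch in the spirit of what the abstract describes: pulling same-class gesture features together and pushing different-class features apart. The thesis's exact DDA Loss formulation is not reproduced here; the margin, squared-distance terms, and weighting are illustrative placeholders.

```python
# Hedged sketch of a discriminative (DDA-style) objective, assuming PyTorch.
import torch
import torch.nn.functional as F


def discriminative_loss(features: torch.Tensor, labels: torch.Tensor,
                        margin: float = 1.0) -> torch.Tensor:
    """features: (batch, dim) gesture embeddings; labels: (batch,) class ids."""
    dists = torch.cdist(features, features)            # pairwise Euclidean distances
    same = labels.unsqueeze(0) == labels.unsqueeze(1)  # (batch, batch) same-class mask
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    intra = dists[same & ~eye]                         # same class, distinct samples
    inter = dists[~same]                               # different classes
    # Pull term: shrink intra-class distances (raise within-class similarity).
    pull = intra.pow(2).mean() if intra.numel() else features.new_tensor(0.0)
    # Push term: force inter-class distances beyond the margin
    # (lower between-class similarity).
    push = F.relu(margin - inter).pow(2).mean() if inter.numel() else features.new_tensor(0.0)
    return pull + push


# Usage: typically combined with cross-entropy during training, e.g.
# loss = F.cross_entropy(logits, labels) + 0.1 * discriminative_loss(embeddings, labels)
```

Applied to embeddings of skeleton point trajectories, an objective of this form directly targets the inter- and intra-class variation problem the abstract identifies.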
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21249
Appears in Collections:Ph.D. Information Technology

Files in This Item:
File: REENA TRIPATHI pH.d..pdf (16.39 MB, Adobe PDF)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.