Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20642
Full metadata record
DC Field | Value | Language
dc.contributor.author | KUMAR, RAHUL | -
dc.date.accessioned | 2024-08-05T08:16:56Z | -
dc.date.available | 2024-08-05T08:16:56Z | -
dc.date.issued | 2023-11 | -
dc.identifier.uri | http://dspace.dtu.ac.in:8080/jspui/handle/repository/20642 | -
dc.description.abstract | Human Action Recognition (HAR) is attracting growing research interest because of its wide range of applications, including elderly monitoring, suspicious activity monitoring, sports activity analysis, pose estimation, and health monitoring. The wide variation in everyday human activities adds complexity to the recognition process. As the use of cameras increases, automated systems based on Machine Learning (ML) and Deep Learning (DL) play a vital role in categorizing actions. One of the primary objectives of artificial intelligence is to create an automated system that can effectively recognize and understand human behaviour and activities shown in video sequences. Over the last decade, several efforts have been made to recognize human activity in video sequences. However, this remains a formidable task owing to factors such as the similarity of actions within the same class, occlusions, differences in viewpoint, and ambient conditions. To address the issues and research gaps identified in vision-based action recognition during the literature review, different state-of-the-art methods are reviewed along with their methodologies. The approaches covered include handcrafted feature extraction and automated feature extraction with deep architectures. In this thesis, the methods are categorized by modality, feature extraction technique, and classification approach. This work describes the various datasets along with their specifications and limitations. The functionality of the latest approaches is presented together with their performance parameters. This thesis covers different methods that use ML- and DL-based techniques, along with their accuracy.
Second, a multimodal Deep Learning method is proposed for a multiview dataset to deal with the multiview problem. RGB, depth, and skeleton data are used to evaluate the multimodal performance of the proposed HAR approach. Depth Motion Maps and Motion History Images are trained separately using the 5S-CNN model. To improve the recognition rate and accuracy, the skeleton representation is trained with a hybrid classifier that combines the 5S-CNN and Bi-LSTM models. Next, decision-level fusion is used to combine the score values obtained from the three separately trained streams. Finally, the activity is determined from the fused score. The proposed 5S-CNN with Bi-LSTM approach is then evaluated to assess its effectiveness. A lightweight pre-trained model is also suggested, which uses fewer parameters than other models, and its performance is evaluated on various parameters. The main objective is to assess a recent pre-trained model for HAR that requires less time and data in the training phase. Such models are efficient for mobile edge devices, as they require less computation power than traditional DL models. The suggested approach performs well compared to various recent techniques. This study also describes the effectiveness of the Vision Transformer in action recognition. The sophisticated design of Vision Transformer models enables them to categorize activities accurately. Using the UCF50 dataset, these models are evaluated against state-of-the-art techniques to compare their respective efficacy.
This research conducted a comparative study using several assessment measures, such as F1-score, precision, and recall, to assess the performance of the model. The proposed approach performs well compared to state-of-the-art models. This thesis concludes with a discussion of significant findings and prospective research directions in the domain of HAR. | en_US
dc.language.iso | en | en_US
dc.relation.ispartofseries | TD-7055; | -
dc.subject | HUMAN ACTIVITY RECOGNITION | en_US
dc.subject | MACHINE LEARNING ALGORITHMS | en_US
dc.subject | AUTOMATED SYSTEM | en_US
dc.subject | EXTRACTION TECHNIQUES | en_US
dc.subject | VIDEOS | en_US
dc.subject | 5S-CNN | en_US
dc.subject | Bi-LSTM | en_US
dc.title | HUMAN ACTIVITY RECOGNITION IN VIDEOS USING MACHINE LEARNING ALGORITHMS | en_US
dc.type | Thesis | en_US
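
The decision-level fusion step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which fusion rule the thesis uses, so the weighted averaging below is an assumption, not the author's method. The function names (`fuse_scores`, `predict`), the stream labels, and the score values are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Sequence


def fuse_scores(stream_scores: Dict[str, Sequence[float]],
                weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Combine per-class scores from several streams by weighted averaging.

    Assumption: each stream (e.g. depth motion map, motion history image,
    skeleton) yields one score per action class, in the same class order.
    """
    if not stream_scores:
        raise ValueError("at least one stream is required")
    names = list(stream_scores)
    num_classes = len(stream_scores[names[0]])
    if any(len(stream_scores[n]) != num_classes for n in names):
        raise ValueError("all streams must score the same number of classes")
    if weights is None:
        # Assumption: equal weighting when no weights are supplied.
        weights = {n: 1.0 for n in names}
    total = sum(weights[n] for n in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(weights[n] * stream_scores[n][c] for n in names) / total
        for c in range(num_classes)
    ]


def predict(stream_scores: Dict[str, Sequence[float]],
            weights: Optional[Dict[str, float]] = None) -> int:
    """Return the index of the class with the highest fused score."""
    fused = fuse_scores(stream_scores, weights)
    return max(range(len(fused)), key=fused.__getitem__)


# Hypothetical softmax outputs for three action classes.
example_scores = {
    "dmm": [0.6, 0.3, 0.1],
    "mhi": [0.2, 0.5, 0.3],
    "skeleton": [0.1, 0.7, 0.2],
}
predicted_class = predict(example_scores)
```

With equal weights, the fused score is the plain mean of the three streams. Passing a `weights` dictionary lets one stream count more than the others, which is one common way to tune score-level fusion.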
Appears in Collections: Ph.D. Computer Engineering

Files in This Item:
File | Description | Size | Format
RAHUL KUMAR Ph.D..pdf | | 3.95 MB | Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.