Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/22750
Full metadata record
DC Field Value Language
dc.contributor.authorSINGH, ROSHNI-
dc.contributor.authorSharma, Abhilasha (SUPERVISOR)-
dc.date.accessioned2026-06-08T05:40:18Z-
dc.date.available2026-06-08T05:40:18Z-
dc.date.issued2025-12-
dc.identifier.urihttp://dspace.dtu.ac.in:8080/jspui/handle/repository/22750-
dc.description.abstractThis thesis aims to develop robust and adaptive methods for activity recognition in challenging environments involving occlusions, low visibility, complex motion patterns, and dynamic backgrounds. While conventional methods based on handcrafted features or template-based models have shown promise, their performance degrades significantly under real-world conditions because of limited generalization, sensitivity to noise, and dependence on clean, well-labeled datasets. To address these issues, the work explores multiple directions, including skeleton-based recognition, spatial-temporal modeling, attention mechanisms, low-light enhancement, and multimodal fusion. The proposed methods are designed to enhance both the accuracy and robustness of recognition systems in real-world settings. First, the thesis presents a systematic literature review that analyzes 88 key publications from 2014 to 2024, selected from over 8,664 research papers. This review categorizes state-of-the-art human activity recognition (HAR) techniques by their architectures, datasets, evaluation strategies, and challenges, highlighting the research gaps in handling real-time, noisy, and occluded scenarios. Based on these insights, a set of machine learning and deep learning frameworks, models, and algorithms is proposed. The second work introduces ConvST-LSTM-Net for skeleton-based activity recognition. This model identifies and processes only the most informative skeletal key joints in each frame, leveraging convolutional and spatiotemporal LSTM layers for effective long-term sequence modeling. To capture subtle spatial-temporal variations in video clips, a spatial-temporal attention-based model, STAD-ConvBi-LSTM, is developed in the third work. This architecture integrates a dual attention mechanism with convolutional and bi-directional LSTM networks to extract discriminative human-centric features. 
The method performs strongly across various datasets and a custom synthetic dataset, achieving recognition accuracies exceeding 96%. To address occlusion in skeleton-based data, a Multi-Stream Part-Aware Spatial-Temporal Graph Convolutional Network (MSPAST-GCN) is proposed. This model uses a part-aware inhibition strategy and a graph convolution-based architecture to model key-joint relationships effectively, even in the presence of missing or noisy data. It outperforms prior methods with a 6% accuracy gain on occlusion-affected datasets. For video-based activity classification, a hybrid model named MV-DBiLSTM is presented, which combines MobileNetV2 for spatial feature extraction with a Deep Bi-LSTM network for learning temporal dependencies. This framework balances computational efficiency and deep temporal reasoning, making it suitable for deployment in smart systems. For visually challenging conditions such as low-light environments, where traditional recognition systems struggle, this thesis proposes a low-light enhancement pipeline integrated with HAR models. A combination of local enhancement modules and transformer-based global adjustment is used to improve visibility without distorting critical features. This significantly improves activity detection in surveillance scenarios under poor lighting. All proposed models are rigorously validated across benchmark and synthetic datasets using both quantitative and qualitative assessments. The analysis demonstrates that all the presented methods outperform contemporary approaches in terms of recognition accuracy, temporal consistency, and adaptability to diverse real-world conditions. Overall, this thesis contributes multiple novel activity recognition architectures tailored to different challenges: occlusion, temporal complexity, lighting conditions, and data constraints. 
These contributions enable the development of more intelligent, reliable, and context-aware recognition systems, with impactful applications in surveillance, healthcare, smart homes, and assistive technologies.en_US
dc.language.isoenen_US
dc.relation.ispartofseriesTD-8656;-
dc.subjectPREDICTIVE MODELen_US
dc.subjectMACHINE LEARNINGen_US
dc.subjectACTIVITY RECOGNITIONen_US
dc.subjectLSTM MODELen_US
dc.titleDESIGN AND DEVELOPMENT OF PREDICTIVE MODEL FOR ACTIVITY RECOGNITION USING MACHINE LEARNINGen_US
dc.typeThesisen_US
Appears in Collections:Ph.D. Computer Engineering

Files in This Item:
File Description Size Format
ROSHNI SINGH Ph.D..pdf 16.3 MB Adobe PDF
ROSHNI SINGH Plag..pdf 26.72 MB Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.