Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21808
Title: REINFORCED ATTENTION FOR VIDEO SUMMARISATION
Authors: RAY, SIMRAN
Keywords: REINFORCED ATTENTION
VIDEO SUMMARISATION
GLASN
Issue Date: Jun-2025
Series/Report no.: TD-8019;
Abstract: Video summarization is a critical task for enabling efficient browsing, retrieval, and storage of large-scale video content by generating concise yet informative summaries. In this paper, we propose the Global and Local Attention-based Video Summarization Network (GLASN), a novel framework that combines global and local attention mechanisms with positional encoding to model both long-range dependencies and local temporal dynamics within video sequences. By leveraging this combined attention framework, GLASN selectively focuses on semantically important frames while maintaining the global context necessary for coherent summaries. We formulate video summarization as a sequential decision-making problem and adopt a reinforcement learning (RL) framework, optimizing GLASN with reward functions that promote both diversity and representativeness, two key factors for high-quality summaries. Importantly, our approach is fully unsupervised, eliminating the need for labor-intensive, human-annotated labels. This is crucial for scalability in real-world applications where annotating large volumes of data is infeasible. Extensive experiments on benchmark datasets demonstrate that GLASN effectively captures the essence of video content and outperforms or is competitive with state-of-the-art methods, showcasing the benefits of attention-based architectures and unsupervised RL training for video summarization.
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21808
Appears in Collections: M.E./M.Tech. Computer Engineering
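
The abstract states that GLASN is trained with rewards promoting diversity and representativeness but does not give their formulas. The sketch below shows one common way such rewards are defined in unsupervised RL-based summarization (mean pairwise cosine dissimilarity for diversity, and an exponentiated mean nearest-selected-frame distance for representativeness). These definitions, the function names, and the NumPy feature layout are assumptions for illustration, not necessarily the thesis's own formulation.

```python
import numpy as np


def diversity_reward(features, selected):
    """Mean pairwise cosine dissimilarity among the selected frames.

    features: (T, D) array of per-frame feature vectors (assumed non-zero).
    selected: length-T binary mask, 1 where a frame is kept in the summary.
    """
    idx = np.flatnonzero(selected)
    n = idx.size
    if n < 2:
        return 0.0  # diversity is undefined for fewer than two frames
    x = features[idx]
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    dissim = 1.0 - x @ x.T
    # Average over ordered pairs (i, j) with i != j, so drop the diagonal.
    return float((dissim.sum() - np.trace(dissim)) / (n * (n - 1)))


def representativeness_reward(features, selected):
    """exp(-mean distance from each frame to its nearest selected frame)."""
    idx = np.flatnonzero(selected)
    if idx.size == 0:
        return 0.0
    # (T, n) matrix of Euclidean distances from every frame to every selected frame.
    dist = np.linalg.norm(features[:, None, :] - features[None, idx, :], axis=2)
    return float(np.exp(-dist.min(axis=1).mean()))


if __name__ == "__main__":
    feats = np.eye(3)            # three mutually orthogonal toy "frames"
    mask = np.array([1, 1, 0])   # keep the first two frames
    total = diversity_reward(feats, mask) + representativeness_reward(feats, mask)
    print(total)
```

In a policy-gradient setup, the sum of the two terms would typically serve as the episode reward for a sampled frame selection; how GLASN weights or combines them is not stated in the abstract.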

Files in This Item:
File: SIMRAN RAY M.Tech..pdf | Size: 988.85 kB | Format: Adobe PDF

