Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/19204
Title: | IMAGE PARAGRAPH GENERATION USING DEEP LEARNING |
Authors: | TIWARI, AVANISH |
Keywords: | IMAGE PARAGRAPH; DEEP LEARNING; HIERARCHICAL MODEL; RNN; CNN |
Issue Date: | May-2022 |
Series/Report no.: | TD-5770 |
Abstract: | Recently, a neural-network-based approach to the automatic generation of image descriptions has become popular. Originally introduced as neural image captioning, it refers to a family of models in which several neural network components are connected end-to-end to infer the most likely caption for an input image. Neural image captioning models usually comprise a Convolutional Neural Network (CNN) based image encoder and a Recurrent Neural Network (RNN) language model that generates image captions from the output of the CNN. Generating long image captions, commonly referred to as paragraph captions, is more challenging than producing shorter, sentence-length captions. When generating paragraph captions, the model has more degrees of freedom because the total number of possible sentence combinations is larger. In this thesis, we describe a combination of two approaches to improve paragraph captioning: using a hierarchical RNN model that adds a top-level RNN to keep track of the sentence context, and using richer visual features obtained from dense captioning networks. In addition to the standard MS-COCO Captions dataset used for image captioning, we also use the Stanford-Paragraph dataset, which is specifically designed for paragraph captioning. This thesis describes experiments performed on three variants of RNNs for generating paragraph captions. The flat model uses a non-hierarchical RNN, the hierarchical model implements a two-level hierarchical RNN, and the hierarchical-coherent model improves on the hierarchical model by optimizing the coherence between sentences. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/19204 |
Appears in Collections: | M.E./M.Tech. Electronics & Communication Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Avanish Tiwari_M.Tech.pdf | | 1.87 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.