Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/14201
Title: AUTOMATIC EVALUATING TEXT COHERENCE USING DISCOURSE RELATIONS
Authors: RATN SHAH, RAJIV
Keywords: PDTB
M.TECH
AUTOMATIC EVALUATING
DISCOURSE RELATIONS
Issue Date: 20-Jun-2013
Series/Report no.: TD -1173;
Abstract: This thesis addresses the automatic evaluation of text coherence. In linguistics, coherence is what makes a text semantically meaningful. Automatic evaluation of text coherence is the task of determining which of two texts is more coherent, given a text and a permutation of its sentence order. This problem has been at the core of Natural Language Processing (NLP) for the past few years. NLP is often described as a discipline that relates human language ability to what computers can do, and its study helps us approach human-level performance with computers. Determining which of two texts is more coherent is an important and challenging NLP problem. One well-known application is imposing an order on sentences in multi-document summarization. With the tremendous growth of data, users expect more relevant and sophisticated information, which text summarization can help provide. NLP is also described as a discipline for developing applications related to human language. It involves different techniques and algorithms for determining which text is more coherent between a text and its sentence-order permutation, and these are applicable to text summarization. Coherence modeling can be applied to differentiating a text from its permutation (i.e., the same text with its sentence order shuffled) and to identifying the better-written essay in a pair. In this thesis, we propose a novel approach to automatically evaluating text coherence that combines a new technique with existing text coherence techniques. We also demonstrate its effectiveness over previous techniques such as Entity Grid Relations and Discourse Relations over the Entity Grid Model.
Entity Grid Relations was the first popular technique for automatically evaluating text coherence, but its accuracy is well below human-level performance on the task. To improve the accuracy of the Entity Grid Model, Discourse Relations imposed over the Entity Grid Model have become popular (Lin et al., 2011). Lin et al. (2011) employ a Discourse Relation Matrix to determine discourse relation transitions of different lengths. However, the accuracy of the Discourse Relations Model is still lower than that of a human evaluator on the coherence evaluation task. Our proposed model decides which text of a pair is more coherent. We present a novel approach that combines a few independent semantic features to determine the coherence of a text. Our study of linguistics indicates that co-reference plays a vital role in determining the coherence of a text. In particular, there exists a model of noun phrase syntax that uses the distance (called the Hobbs distance) between a noun phrase and its co-referent, together with the statistical distribution of the discourse structure and relations. We treat text coherence as a ranking learning problem because, for a given pair, one text is more coherent than the other. Our system assigns a higher score to the more coherent text. Our experiments show that combining these features improves the accuracy of automatic text coherence evaluation. We apply Penn Discourse Treebank (PDTB) discourse relation values (Lin et al., 2011) and noun phrase co-reference over the Entity Grid Model of Barzilay and Lapata (2005; 2008), a popular model of local coherence. Our experiments and results demonstrate that our model achieves higher accuracy than the baseline model. The accuracy of our system is closer to that of human evaluators than that of other existing models for automatically evaluating text coherence.
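To make the entity-grid and ranking ideas in the abstract concrete, the following is a minimal Python sketch, not the thesis implementation. It assumes each sentence has already been reduced to a hand-annotated mapping from entity to grammatical role (S = subject, O = object, X = other); the function names, the toy document, and the choice of length-2 transitions are illustrative assumptions. It computes role-transition frequencies over the grid and forms the feature difference between a text and its permutation, which is the usual input to a pairwise ranking learner. The PDTB discourse relation and co-reference features described above are not modeled here.

```python
from collections import Counter
from itertools import product

ROLES = ("S", "O", "X", "-")  # subject, object, other, absent


def entity_grid(sentences):
    """Return one column of roles per entity, with one entry per sentence."""
    entities = sorted({e for s in sentences for e in s})
    return {e: [s.get(e, "-") for s in sentences] for e in entities}


def transition_features(sentences, n=2):
    """Relative frequency of every length-n role transition in the grid."""
    counts = Counter()
    for roles in entity_grid(sentences).values():
        for i in range(len(roles) - n + 1):
            counts[tuple(roles[i:i + n])] += 1
    total = sum(counts.values())
    # Fixed feature order: all |ROLES|**n possible transitions.
    return [counts[t] / total if total else 0.0
            for t in product(ROLES, repeat=n)]


def pairwise_example(original, permuted, n=2):
    """Feature difference (original minus permuted) for a pairwise ranker."""
    f_orig = transition_features(original, n)
    f_perm = transition_features(permuted, n)
    return [a - b for a, b in zip(f_orig, f_perm)]


# Toy, hand-annotated document (hypothetical data): entity -> role per sentence.
doc = [
    {"thesis": "S", "coherence": "O"},
    {"thesis": "S", "model": "O"},
    {"model": "S", "accuracy": "O"},
]
shuffled = [doc[2], doc[0], doc[1]]  # one sentence-order permutation

features = transition_features(doc)
diff = pairwise_example(doc, shuffled)
```

In a full system, the difference vectors for many (text, permutation) pairs would be passed to a ranking learner, so that the original ordering receives the higher score.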
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/14201
Appears in Collections: M.E./M.Tech. Computer Technology & Applications

Files in This Item:
File: Rajiv_Final_Thesis-1.pdf
Size: 1.06 MB
Format: Adobe PDF

