Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/15290
Title: | ANALYSIS OF AUTOMATIC TEXT SUMMARIZATION TECHNIQUES |
Authors: | KUMAR, HEMANT |
Keywords: | TEXT SUMMARIZATION CLUSTERING EXTRACTIVE KMEANS ALGORITHM LEXRANK REDUCTION PAGERANK ATS MDS |
Issue Date: | Oct-2016 |
Series/Report no.: | TD NO.2579; |
Abstract: | It has been 58 years since the publication of Luhn’s original paper on automatic summarization. Automatic Text Summarization (ATS) is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document. As information overload has grown and the quantity of data has increased, interest in automatic summarization has grown with it. It is very difficult for human beings to summarize large textual documents manually. Text summarization methods can be classified as extractive or abstractive. Extractive summarization selects the most important sentences or paragraphs of a text and joins them into a shorter form. Its advantages include simplicity, speed, and reduced cost and reading time. One disadvantage is that the extracted sentences may be too moderate. Important and relevant information may also be spread across several sentences, and an extractive technique cannot capture it. Extractive summarization is performed in two steps: pre-processing and processing. The pre-processing step identifies sentence boundaries, removes terms that carry no meaning (stop words) and finds word roots (stemming). The processing step determines the relevance and connection of each sentence to the subject, assigns a weight to every sentence, and then uses the PageRank algorithm to rank the sentences according to their weights. Finally, the sentences with the highest scores are chosen for the final summary. In this project, we present several techniques for generating extraction-based summaries, including pattern matching, K-means clustering, Reduction, the TextRank algorithm and the LexRank algorithm. The frequency-based technique produces a more meaningful summary, whereas with K-means clustering the summary might not make sense because sentences are extracted out of order. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/15290 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
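The two-step extractive pipeline outlined in the abstract (pre-processing, then PageRank-based sentence ranking) can be sketched roughly as follows. This is a minimal illustration and not the thesis's implementation: the stop-word list, the suffix-stripping `crude_stem` helper, the Jaccard similarity measure and all function names are assumptions made for the example. It keeps the top-ranked sentences, as is usual for PageRank-style rankers.

```python
import re

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "and",
              "in", "on", "it", "for", "by", "with"}


def split_sentences(text):
    """Pre-processing: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def crude_stem(word):
    """Rough stand-in for a real stemmer: strip a few common suffixes."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def content_terms(sentence):
    """Pre-processing: lowercase, drop stop words, reduce words to crude roots."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {crude_stem(w) for w in words if w not in STOP_WORDS}


def similarity(a, b):
    """Jaccard overlap between two term sets (one of several possible measures)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over a similarity matrix.

    Sentences with no similar neighbours pass on no rank, so the scores
    need not sum to 1; only their relative order is used here.
    """
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = sum(
                weights[j][i] / out_sums[j] * scores[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def summarize(text, num_sentences=2):
    """Return the top-ranked sentences, restored to their original order."""
    sentences = split_sentences(text)
    if len(sentences) <= num_sentences:
        return sentences
    terms = [content_terms(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(terms[i], terms[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(text, num_sentences=3)` on a short passage returns the three sentences most connected to the rest of the text. Re-sorting the selected indices before returning keeps the summary in document order, which is one simple way to avoid the out-of-order extraction problem the abstract attributes to K-means clustering.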
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
front_cover---done!.pdf | | 181.11 kB | Adobe PDF | View/Open |
Final_THESIS_hemant.pdf | | 1.77 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.