ANALYSIS OF AUTOMATIC TEXT SUMMARIZATION TECHNIQUES

KUMAR, HEMANT

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/15290

Title:	ANALYSIS OF AUTOMATIC TEXT SUMMARIZATION TECHNIQUES
Authors:	KUMAR, HEMANT
Keywords:	TEXT SUMMARIZATION CLUSTERING EXTRACTIVE KMEANS ALGORITHM LEXRANK REDUCTION PAGERANK ATS MDS
Issue Date:	Oct-2016
Series/Report no.:	TD NO.2579;
Abstract:	It’s been 58 years since the publication of Luhn’s original paper on automatic summarization. Automatic Text Summarization (ATS) is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document. As the problem of information overload has grown and the quantity of data has increased, interest in automatic summarization has also flourished. It is very difficult for human beings to manually summarize large textual documents. Text summarization methods can be classified into extractive and abstractive summarization. Extractive summarization involves choosing important sentences and paragraphs from the text and joining them in a shorter form. The advantages of this technique include simplicity, high speed, and reduced cost and study time. One disadvantage is that the extracted sentences could also be too moderate. Important and relevant information might also be spread across sentences, and extractive techniques cannot capture it. The overall method of extractive summarization is performed in two steps: pre-processing and processing. In the pre-processing step, sentence boundaries are identified, terms that carry no meaning are removed, and word roots are found. In the processing step, the importance of each sentence and its relevance to the subject are determined and a weight is assigned to every sentence; the PageRank algorithm is then used to rank the sentences according to their weights. Finally, the sentences with the highest scores are chosen for the final summary. In this project, we present some techniques for generating extraction-based text summaries, including pattern matching, K-means clustering, Reduction, the TextRank algorithm and the LexRank algorithm. The summary obtained with the frequency-based technique is more meaningful, whereas with K-means clustering the summary might not make sense because sentences are extracted out of order.
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/15290
Appears in Collections:	M.E./M.Tech. Computer Engineering
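
The two-step extractive pipeline described in the abstract (pre-processing, then PageRank-based ranking of weighted sentences) can be sketched roughly as follows. This is a minimal illustration in the spirit of TextRank, not the thesis's own implementation: the stop-word list, the `crude_stem` helper, the overlap-based `similarity` measure and the damping value are all assumptions chosen to keep the example self-contained.

```python
import math
import re

# Tiny illustrative stop-word list (an assumption; real systems use larger lists).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "it", "on", "for"}


def split_sentences(text):
    """Pre-processing: split after sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def crude_stem(word):
    """Very rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def tokenize(sentence):
    """Pre-processing: lowercase, drop stop words, reduce words to rough roots."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {crude_stem(w) for w in words if w not in STOP_WORDS}


def similarity(a, b):
    """Shared-word count normalised by log sentence lengths.

    Sentences with fewer than two content words are treated as unconnected,
    which also avoids dividing by log(1) + log(1) = 0.
    """
    if len(a) < 2 or len(b) < 2:
        return 0.0
    return len(a & b) / (math.log(len(a)) + math.log(len(b)))


def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over the sentence similarity matrix."""
    n = len(weights)
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences, restored to document order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    weights = [
        [similarity(tokens[i], tokens[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(weights)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = "Cats chase mice. Mice fear cats. Cats and mice live in houses. Bananas grow slowly."
    for sentence in summarize(sample, k=2):
        print(sentence)
```

Returning the selected sentences in their original order is a deliberate choice in this sketch: it sidesteps the out-of-order extraction problem that the abstract attributes to K-means clustering.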

Files in This Item:

File	Description	Size	Format
front_cover---done!.pdf		181.11 kB	Adobe PDF
Final_THESIS_hemant.pdf		1.77 MB	Adobe PDF
