Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/15820
Title: | DOCUMENT CLASSIFICATION USING UNIQUE AND ELITE KEYWORDS BASED ON ENTROPY BASED PARTITIONING |
Authors: | KESHARI, JULI |
Keywords: | DOCUMENT CLASSIFICATION; ELITE KEYWORDS; UNIQUE KEYWORDS; PARTITIONING |
Issue Date: | Jun-2017 |
Series/Report no.: | TD-2793 |
Abstract: | This project investigates the selection of significant keywords for document classification. We propose two schemes for selecting significant keywords: elite and unique elite. Elite keywords are those that have a high term frequency within a class, irrespective of their frequencies in other classes. To identify the high-frequency terms in each class, we employ an entropy-based partitioning technique that is commonly used in information theory and coding to partition symbol probabilities. Compared with other feature selection schemes, our method has the advantage of yielding the exact subset of significant keywords for each class without relying on trial and error. Unique elite keywords are those that are elite for a particular class and, at the same time, occur with high frequency only in that class. To identify them, we compute the entropy of each elite keyword across all classes, sort the entropies in ascending order, and apply entropy-based partitioning again to shortlist the elite keywords that occur uniquely in that class. Comparison with state-of-the-art methods on benchmark data sets demonstrates the effectiveness of our method through the high classification accuracy obtained. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/15820 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
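The two-stage selection described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the abstract does not state the exact partitioning criterion, so the sketch assumes a Shannon-Fano-style split that cuts a sorted list where the two parts have the most nearly equal totals. The function names (`balanced_split`, `elite_keywords`, `class_entropy`, `unique_elite_keywords`) and the per-class `Counter` input format are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def balanced_split(weights):
    """Return k such that sum(weights[:k]) and sum(weights[k:]) are as close
    as possible. ASSUMPTION: a Shannon-Fano-style equal-mass split; the
    abstract does not specify the actual criterion. Ties go to the larger k."""
    total = sum(weights)
    if not weights or total == 0:
        return len(weights)  # nothing to separate
    best_k, best_gap, running = 1, float("inf"), 0.0
    for k in range(1, len(weights)):
        running += weights[k - 1]
        gap = abs(2 * running - total)  # |head sum - tail sum|
        if gap <= best_gap:
            best_k, best_gap = k, gap
    return best_k


def elite_keywords(term_counts):
    """Elite keywords of one class: the head of the frequency-sorted terms."""
    ranked = term_counts.most_common()  # descending term frequency
    k = balanced_split([count for _, count in ranked])
    return [term for term, _ in ranked[:k]]


def class_entropy(term, counts_by_class):
    """Shannon entropy (bits) of a term's frequency distribution over classes.
    Low entropy means the term is concentrated in few classes."""
    freqs = [counts[term] for counts in counts_by_class.values()]
    total = sum(freqs)
    if total == 0:
        return 0.0
    return sum((f / total) * math.log2(total / f) for f in freqs if f > 0)


def unique_elite_keywords(label, counts_by_class):
    """Elite keywords of `label` whose cross-class entropy falls in the
    low-entropy part of a second split (entropies sorted ascending)."""
    elite = elite_keywords(counts_by_class[label])
    scored = sorted((class_entropy(t, counts_by_class), t) for t in elite)
    k = balanced_split([h for h, _ in scored])
    return [term for _, term in scored[:k]]


# Toy term counts per class (made-up data, for illustration only).
counts_by_class = {
    "sports": Counter(goal=6, game=5, team=4, ref=2, kit=1),
    "politics": Counter(vote=7, game=5, law=3),
}
print(elite_keywords(counts_by_class["sports"]))
print(unique_elite_keywords("sports", counts_by_class))
```

In this toy data, "game" is frequent in both classes, so its cross-class entropy is high and it is dropped at the second stage, while "goal" occurs in only one class and is kept. Low entropy alone only shows that a term is concentrated in some class; a fuller version would presumably also confirm that this class is the one under consideration.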
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Complete_Thesis[1].pdf | | 1.4 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.