Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/15938
Title: A STUDY ON FEATURE SUBSET SELECTION FOR HIGH DIMENSIONAL DATA
Authors: CHANDEL, AYUSH
Keywords: FEATURE SUBSET SELECTION; HIGH DIMENSIONAL DATA; HYBRID ALGORITHM; DICE
Issue Date: Jul-2017
Series/Report no.: TD-2917
Abstract: High dimensional data is data consisting of thousands of attributes or features. Such data is now common in scientific and research applications. Because thousands of features are present, we need to select the features that are most relevant and non-redundant in order to reduce dimensionality and runtime and to improve the accuracy of the results. In this thesis we provide an overview of some of the methods present in the literature. We study the existing methods and present a HYBRID algorithm for feature selection that incorporates the clustering aspects of the FAST feature selection algorithm and the similarity measure of the DICE coefficient. The efficiency and accuracy of the results are evaluated by empirical study. The algorithm is a novel clustering-based feature subset selection algorithm for high dimensional data. It involves (i) removing irrelevant features, (ii) constructing a minimum spanning tree (MST) from the relevant ones, and (iii) partitioning the MST and selecting representative features. In the proposed algorithm, a cluster consists of features. Each cluster is treated as a single feature, and thus dimensionality is greatly reduced. The proposed system is an implementation of the FAST algorithm together with the DICE coefficient to remove irrelevant and redundant features.
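The three steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation. It assumes binary features, each represented as the set of sample indices where the feature is present, and it uses the Dice coefficient both for feature-to-class relevance and for feature-to-feature similarity. The names select_features and DisjointSet, the relevance_threshold value, and the rule for cutting MST edges (drop an edge when the two features are less similar to each other than each is to the class) are all assumptions made for illustration.

```python
from itertools import combinations


def dice(a, b):
    """Dice coefficient of two sets: 2|A & B| / (|A| + |B|)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


class DisjointSet:
    """Small union-find helper (hypothetical, for this sketch only)."""

    def __init__(self, items):
        self.parent = {item: item for item in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def select_features(features, target, relevance_threshold=0.3):
    # (i) Remove irrelevant features: keep those similar enough to the class.
    relevance = {name: dice(rows, target) for name, rows in features.items()}
    relevant = sorted(n for n, r in relevance.items() if r >= relevance_threshold)

    # (ii) Build an MST over the relevant features with Kruskal's algorithm.
    # Minimising the distance 1 - Dice is the same as taking edges in
    # order of descending Dice similarity.
    edges = [(dice(features[u], features[v]), u, v)
             for u, v in combinations(relevant, 2)]
    edges.sort(key=lambda e: -e[0])
    forest = DisjointSet(relevant)
    mst = [(s, u, v) for s, u, v in edges if forest.union(u, v)]

    # (iii) Partition the MST (assumed rule): cut an edge when the pair is
    # less similar to each other than each feature is to the class.
    clusters = DisjointSet(relevant)
    for s, u, v in mst:
        if s < relevance[u] and s < relevance[v]:
            continue  # edge removed, the two sides stay in separate clusters
        clusters.union(u, v)

    # Pick the most class-relevant feature of each cluster as representative.
    best = {}
    for name in relevant:
        root = clusters.find(name)
        if root not in best or relevance[name] > relevance[best[root]]:
            best[root] = name
    return sorted(best.values())


# Toy data: f2 nearly duplicates f1, f4 never co-occurs with the class.
toy_features = {
    "f1": {0, 1, 2, 3},
    "f2": {0, 1, 2},
    "f3": {4, 5},
    "f4": {8, 9},
}
toy_target = {0, 1, 2, 3, 4, 5}
selected = select_features(toy_features, toy_target)
```

On this toy input, f4 is dropped as irrelevant, f1 and f2 fall into one cluster represented by f1, and f3 forms its own cluster, so selected is ["f1", "f3"].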
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/15938
Appears in Collections: M.E./M.Tech. Computer Engineering


