Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/123456789/514
Title: A PARTITIONED APPROACH TO RULE MINING
Authors: NAGPAL, MONIKA
Keywords: MINING
Issue Date: 3-Dec-2010
Series/Report no.: TD 713
Abstract: Database mining is the process of extracting interesting and previously unknown patterns and correlations from data stored in Database Management Systems (DBMSs). Association rule mining is the process of discovering items that tend to occur together in transactions. If the data to be mined are stored as relations in multiple databases, a partitioned approach is more appropriate than moving data from one database to another. Mining for association rules between items in a large database of sales transactions has been described as an important database mining problem. In this paper we present an efficient algorithm for mining association rules that is fundamentally different from known algorithms. Compared to previous algorithms, our algorithm not only reduces the I/O overhead significantly but also has lower CPU overhead in most cases. Our approach uses an SQL-based K-way join algorithm and its optimizations. Our results indicate that, beyond a certain data set size, this approach preserves accuracy and yields better performance. We performed extensive experiments and compared the performance of our algorithm with one of the best existing algorithms. For large databases, the CPU overhead was reduced by as much as a factor of four and I/O was reduced by almost an order of magnitude. Hence this algorithm is especially suitable for very large databases.
Description: ME THESIS
URI: http://dspace.dtu.ac.in:8080/jspui/handle/123456789/514
Appears in Collections: M.E./M.Tech. Computer Technology & Applications

Files in This Item:
File          Size      Format
firstpar.doc  75 kB     Microsoft Word
par app.doc   397.5 kB  Microsoft Word