Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/14243
Title: Improving Join Query Processing in MapReduce Environment
Authors: Shaikh, Anwar Dilawar
Keywords: Optimization Techniques
Issue Date: 11-Jul-2013
Series/Report no.: TD-1046
Abstract: MapReduce is a framework for processing large data sets in which straightforward computations are performed on large inputs by hundreds of machines. Data can be stored and retrieved using structured queries, and join queries are used frequently, so efficient join processing techniques are crucial. In this project we have analyzed, theoretically and practically, various join processing algorithms in MapReduce: Default Hadoop Join, Broadcast Join, Optimized Broadcast Join, Trojan Join, and the Multijoin algorithm. These algorithms are compared on the basis of the number of MapReduce jobs involved and their advantages and disadvantages. We have proposed optimization techniques to improve join query performance, namely Dynamic Hash Table Creation, Compressed Broadcast Join, and Hash Broadcast Join. We have also suggested a join selection strategy that helps select the most suitable join processing algorithm for a particular application based on various parameters. Experiments measuring the performance of these algorithms were conducted on the Amazon cloud using the Elastic MapReduce, EC2, and S3 services provided by Amazon Web Services. The results showed that the proposed optimization techniques improved the performance of join query execution in the MapReduce environment.
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/14243
Appears in Collections: M.E./M.Tech. Computer Technology & Applications

Files in This Item:
File                          Description  Size       Format
1-Front page.pdf                           300.06 kB  Adobe PDF
2-certificate and index.pdf                231.32 kB  Adobe PDF
3-main content.pdf                         1.65 MB    Adobe PDF

