STUDY AND ANALYSIS OF BIG DATA ANALYTICS FRAMEWORKS AND CHALLENGES

SHEKHAWAT, SAURABH SINGH

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/19847

Full metadata record

DC Field	Value	Language
dc.contributor.author	SHEKHAWAT, SAURABH SINGH	-
dc.date.accessioned	2023-06-12T09:33:47Z	-
dc.date.available	2023-06-12T09:33:47Z	-
dc.date.issued	2023-05	-
dc.identifier.uri	http://dspace.dtu.ac.in:8080/jspui/handle/repository/19847	-
dc.description.abstract	Data is being generated at an exponentially growing rate. People are the main source of this growth, with an estimated 40 exabytes of data generated daily on average. The data comes from many sources, such as social media, online transactions, IoT devices, digital media, records, and sensors, and handling this volume has become a challenging task. The data may be large or small in size and may be unstructured, semi-structured, or structured. Traditional techniques cannot handle big data at this scale, so we need methods and mechanisms that are simple to use, quick to process, and effective. Two major technologies for managing such data are Hadoop and Spark, each of which provides tools and techniques for storing, processing, and analysing data. The Hadoop framework processes data in a distributed manner. Its core components are HDFS for storage, MapReduce for parallel distributed processing, and YARN for scheduling tasks. Spark uses resilient distributed datasets (RDDs) for fast processing, to reduce computational cost. This report covers the Hadoop architecture, how Hadoop stores and processes data using MapReduce, the Apache Spark technology and how it executes jobs, how Spark improves on Hadoop, and a comparative analysis of Hadoop and Spark.	en_US
dc.language.iso	en	en_US
dc.relation.ispartofseries	TD-6407;	-
dc.subject	BIG DATA	en_US
dc.subject	HDFS	en_US
dc.subject	HADOOP	en_US
dc.subject	SPARK	en_US
dc.subject	ARCHITECTURE	en_US
dc.title	STUDY AND ANALYSIS OF BIG DATA ANALYTICS FRAMEWORKS AND CHALLENGES	en_US
dc.type	Thesis	en_US
Appears in Collections:	M.E./M.Tech. Computer Engineering

Files in This Item:

File	Description	Size	Format
Saurabh Singh Shekhawat M.Tech.pdf		4.42 MB	Adobe PDF
