Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/15032
Title: DATA MINING USING SUPERVISED MACHINE LEARNING TECHNIQUES
Authors: RITU
Keywords: DATA MINING; MACHINE LEARNING; NAIVE BAYES; KNN
Issue Date: Aug-2016
Series/Report no.: TD NO.2310
Abstract: Every day, human beings generate vast amounts of data from many sources, both online and offline. The data may take the form of documents, graphics, video, or records (varying array). Because it is available in different formats, appropriate action is needed not only to analyze the data but also to extract important information and patterns from it and to maintain it. The data should be available as and when clients require it, and it should be retrievable from the database to help them make better decisions. This technique is what we call data mining. Machine learning is a subfield of computer science that involves the study and construction of algorithms that can learn from data and make predictions on it. First, a model is built from a training set of input observations so that the predictions are data driven; the machine learning algorithm then operates on test data to generate the predicted outcome. Strict static program instructions are not followed here. Instead, historical data is used to make the prediction. Within the field of data analytics, machine learning is a method used to devise complex models and algorithms that lend themselves to prediction. These analytical models allow researchers, data scientists, engineers, and analysts to "produce reliable, repeatable decisions and results" and uncover "hidden insights" by learning from historical relationships and trends in the data. In this work, two popular machine learning algorithms, k-nearest neighbours and naïve Bayes, are studied, and a new hybrid algorithm that uses the best features of both is proposed. In recent years, there has been a dramatic increase in the use of machine learning techniques within healthcare systems to analyse, predict, and classify clinical data. Therefore, we have selected three datasets containing health-related data. All three algorithms have been implemented in Python and run against all the datasets.
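The train-then-predict workflow described in the abstract can be illustrated with a minimal, standard-library sketch of the two base classifiers it names. This is not the thesis code: the function names, the Gaussian form of naive Bayes, the choice of k = 3, and the toy data are all assumptions made for illustration. The hybrid algorithm is not sketched because the abstract gives no details of it.

```python
import math
from collections import Counter, defaultdict


def knn_predict(train_X, train_y, x, k=3):
    """Classify x by majority vote among its k nearest training points."""
    # Sort training points by Euclidean distance to x and keep the k closest.
    nearest = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def fit_gaussian_nb(train_X, train_y):
    """Estimate a class prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(train_X, train_y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # A small constant keeps every variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(zip(*rows), means)]
        model[label] = (n / len(train_X), means, variances)
    return model


def nb_predict(model, x):
    """Return the class with the highest log posterior under the fitted model."""
    def log_posterior(params):
        prior, means, variances = params
        score = math.log(prior)
        # Features are treated as independent given the class (the "naive" step).
        for v, m, var in zip(x, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=lambda label: log_posterior(model[label]))


# Made-up toy data (two numeric features per record), for illustration only.
train_X = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9), (3.0, 3.2), (3.2, 2.9), (2.9, 3.1)]
train_y = ["negative", "negative", "negative", "positive", "positive", "positive"]
test_X = [(1.0, 1.0), (3.1, 3.0)]

# Build the model from training data, then predict on held-out test data.
nb_model = fit_gaussian_nb(train_X, train_y)
knn_predictions = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
nb_predictions = [nb_predict(nb_model, x) for x in test_X]
print(knn_predictions, nb_predictions)
```

k-nearest neighbours stores the training set and defers all work to prediction time, whereas naive Bayes summarises the training set into a few per-class statistics up front. The sketch requires Python 3.8 or later for `math.dist`.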
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/15032
Appears in Collections: M.E./M.Tech. Computer Engineering

Files in This Item:
File: RituThesis.pdf
Size: 1.51 MB
Format: Adobe PDF

