Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18821
Title: SENTIMENT ANALYSIS ON TWITTER DATA
Authors: YADAV, DEEPIKA
Keywords: TWITTER DATASET
SENTIMENT CLASSIFICATION
IMDb DATASET
POLARITY DATASET
Issue Date: Oct-2020
Publisher: DELHI TECHNOLOGICAL UNIVERSITY
Series/Report no.: TD - 5353;
Abstract: Before purchasing a product, people generally visit several shops in the market, ask about the product, its price, and its warranty, and finally buy it based on the opinions they form about the price and quality of service. This process is time-consuming, and the chance of being cheated by the seller is higher because no one guides the buyer toward a genuine product at a fair price. Nowadays, however, a good number of people rely on the online market to buy the products they need. This is because information about the products is available from many sources, and online shopping is comparatively cheap and offers home delivery. Before placing an order, customers very often refer to the comments or reviews of existing users of the product, which help them judge the quality of the product as well as the service provided by the seller. Similarly, many critics in the field of movies watch a film and then comment on its quality, either recommending whether or not to watch it or giving a five-star rating. These reviews are mostly in text format and are sometimes hard to understand. They therefore need to be processed appropriately to obtain meaningful information. Classification of these reviews is one approach to extracting knowledge from them. In this thesis, different machine learning techniques are used to classify the reviews. Simulations and experiments are carried out to evaluate the performance of the proposed classification methods. Many researchers have considered two review datasets for sentiment classification, namely the IMDb and Polarity datasets. The IMDb dataset is divided into training and testing data.
Accordingly, the training data are used to train the machine learning algorithms, and the testing data are used to test the trained models. The Polarity dataset, on the other hand, does not have separate training and testing data, so the k-fold cross-validation technique is used to classify its reviews. Four machine learning techniques (MLTs), viz., Naive Bayes (NB), Support Vector Machine (SVM), Random Forest (RF), and Linear Discriminant Analysis (LDA), are used to classify these movie reviews. Different performance evaluation parameters are used to evaluate these techniques. Among the four algorithms, RF yields the most accurate classification result. Secondly, n-gram based classification of reviews is carried out on the IMDb dataset. The n-gram techniques used are unigram, bigram, trigram, unigram + bigram, bigram + trigram, and unigram + bigram + trigram. Four machine learning techniques, namely Naive Bayes (NB), Maximum Entropy (ME), Support Vector Machine (SVM), and Stochastic Gradient Descent (SGD), are used to classify the movie reviews based on these n-gram techniques. Different performance evaluation parameters are used to evaluate these techniques. SVM with the unigram + bigram approach gives the most accurate result among all the approaches. Thirdly, an SVM-based feature selection method is used to select the best features from the set of all features. The selected features are then given as input to an Artificial Neural Network (ANN) to classify the review data. In this case, two review datasets, i.e., the IMDb and Polarity datasets, are considered for classification. In this approach, each word of the reviews is treated as a feature, and the sentiment value of each word is calculated.
Feature selection is based on the sentiment values of the words: the words with higher sentiment values are selected. These words then act as input to the ANN, on the basis of which the movie reviews are classified. Finally, a Genetic Algorithm (GA) is used to represent the movie reviews as chromosomes. Different GA operations are carried out to obtain the final classification result. In addition, the GA is used for feature selection, choosing the best features from the set of all features, which are then given as input to the ANN to obtain the final classification result. Different performance evaluation parameters are used to evaluate the GA and the hybrid of GA with ANN. Sentiment analysis often deals with the analysis of reviews and comments about a product, which are mostly textual in nature and need proper processing to yield meaningful information. In this thesis, different approaches are proposed to classify the reviews into distinct polarity groups, i.e., positive and negative. Different MLTs are used to perform the classification task, and the performance of each technique is evaluated using several parameters, viz., precision, recall, F-measure, and accuracy. The results obtained by the proposed approaches are found to be better than the results reported by other authors in the literature using the same datasets and approaches.
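As a rough illustration of the evaluation setup the abstract describes, the sketch below shows unigram + bigram feature extraction, k-fold index splitting for a dataset without a predefined train/test split, and the four reported metrics. It is not the thesis code: the function names, the whitespace tokenizer, the contiguous (unshuffled) folds, and the "pos"/"neg" labels are all assumptions made for this example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def unigram_bigram_features(text):
    """Count unigram + bigram features for one review.

    Assumption: lowercase + whitespace tokenization, chosen only for brevity.
    """
    tokens = text.lower().split()
    return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))


def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: no shuffling; the first (n_samples % k) folds get one extra item.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def evaluate(y_true, y_pred, positive="pos"):
    """Compute precision, recall, F-measure and accuracy for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (
        2 * precision * recall / (precision + recall) if precision + recall else 0.0
    )
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }
```

In a setup like the one described, a classifier would be trained on the features of each fold's training indices and `evaluate` would be applied to its predictions on the held-out indices, with the per-fold metrics averaged at the end.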
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/18821
Appears in Collections:M.E./M.Tech. Computer Engineering

Files in This Item:
File: Thesis_Deepika.pdf
Size: 2.03 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.