Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/18766
Title: | SENTIMENT ANALYSIS ON SOCIAL MEDIA USING SOFT COMPUTING TECHNIQUES |
Authors: | JAISWAL, ARUNIMA |
Keywords: | VOLATILE-UNCERTAIN-COMPLEX-CHAOTIC-AMBIGUOUS (VUCCA); CONVOLUTION NEURAL NETWORK; WOLF-SEARCH ALGORITHM AND DECISION TREE; TEXTUAL TOPIC BASED |
Issue Date: | 2020 |
Publisher: | DELHI TECHNOLOGICAL UNIVERSITY |
Series/Report no.: | TD - 5261 |
Abstract: | “The analysis of variance is not a mathematical theorem, but rather a convenient method of arranging arithmetic” (Ronald Fisher). Social media can be described as a VUCCA world, that is, Volatile-Uncertain-Complex-Chaotic-Ambiguous, which generates an enormous amount of online user content that can be examined further to gain insights for social intelligence. Given the quantum of opinionated data on social media, sentiment analysis now finds use in various marketing, business and government applications. However, the noise, high dimensionality, imbalance, heterogeneity, multimodality and multilinguality of social media data make sentiment analysis challenging. Further, the growing use of micro-texts (creative spellings, slang, etc.) compounds the linguistic challenges of sentiment analysis. Good features are considered the backbone of any learning model, and good feature creation often needs adequate domain knowledge, creativity and time. This necessitates examining new computational methodologies for finding an optimal feature set that improves the performance of the sentiment classifier in terms of predictive accuracy and result comprehensibility. One such consortium of techniques is referred to as soft computing, which provides robust and low-cost solutions that can cope well with these challenges. In this research, we examine sentiments using soft computing on benchmark data (SemEval 2016 & 2017, Sentiment140, IMDb movie review corpus) and scraped (textual, topic-based) data from social media, namely Twitter, Tumblr, etc. Experiments for sentiment analysis using TF-IDF are conducted on these datasets using ensemble techniques (random forests, bagging, boosting, gradient boosting, stochastic gradient boosting and extra trees) and baseline machine learning techniques (naive Bayesian, support vector machine, multilayer perceptron, decision tree and k-nearest neighbour).
This is followed by the application of swarm intelligence techniques (namely particle swarm, binary grey wolf and binary moth flame) for feature optimization on the Twitter (benchmark) datasets for enhanced textual sentiment analysis. Selecting the essential features each time is itself a computationally hard task, so this study also uses a deep convolution neural network with GloVe, which automatically learns features at multiple levels of abstraction without depending completely on hand-crafted features. Deep learning techniques have hierarchical learning capabilities; at the same time, adaptive and heuristic optimization that selects a near-optimal set of input variables, minimizing variance and maximizing the generalizability of the learning model, is highly desirable for achieving high prediction accuracy. Based on this, we finally propose a cognition-driven model for sentiment classification built on the concord of deep learning (convolution neural network) and swarm-optimized machine learning (wolf-search algorithm and decision tree). All results are evaluated using accuracy, precision and recall. The proposed model compares favourably to state-of-the-art approaches and achieves an average accuracy of 89.5 on the SemEval 2016 & 2017 datasets. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/18766 |
Appears in Collections: | Ph.D. Computer Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Arunima Thesis 2K16PhD.pdf | | 3.95 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
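
Illustrative note on the abstract: the experiments described there use TF-IDF features as input to the classifiers. The record does not state which TF-IDF variant, tokenizer or toolkit the thesis uses, so the following is only a minimal sketch under assumptions: term frequency is the count divided by document length, inverse document frequency is log(N / document frequency), and tokenization is a plain whitespace split. The function names and the toy posts are hypothetical and are not taken from the thesis or its datasets.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant (not confirmed by the thesis record):
      tf  = term count / document length
      idf = log(N / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical toy posts, invented for illustration only.
posts = [
    "great movie great cast",
    "terrible movie",
    "great fun",
]
vectors = tf_idf(posts)
```

In this sketch a term that occurs in fewer documents ("terrible") receives a larger weight than an equally frequent term that occurs in more documents ("movie"), which is the property that makes TF-IDF useful as classifier input.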