MULTILINGUAL DEPRESSION DETECTION IN ONLINE SOCIAL MEDIA ACROSS EIGHT INDIAN LANGUAGES

JAYANT, RAJDERKAR VIRAJ

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20698

Title:	MULTILINGUAL DEPRESSION DETECTION IN ONLINE SOCIAL MEDIA ACROSS EIGHT INDIAN LANGUAGES
Authors:	JAYANT, RAJDERKAR VIRAJ
Keywords:	DEPRESSION DETECTION; SOCIAL MEDIA ANALYSIS; INDIAN LANGUAGES; SENTIMENT ANALYSIS; LSTM AND GRU MODELS; LOW-RESOURCE LANGUAGE
Issue Date:	May-2024
Series/Report no.:	TD-7191
Abstract:	Detecting depression through social media platforms has emerged as an important field of study, using machine learning techniques to analyze user-generated posts across diverse linguistic and cultural contexts. This work investigates depression detection on social media, focusing on multilingual analysis across eight Indian languages. The methodology encompasses data collection, preprocessing, model development, and experimental design. Tweets were gathered from Twitter and categorized into depressive and non-depressive subsets based on keyword analysis. Ethical considerations were paramount during data collection and labelling, ensuring privacy and compliance with ethical guidelines. Data preprocessing steps, consisting of text cleaning, normalization, and tokenization, were applied to prepare the dataset for analysis. Word embeddings from pre-trained FastText vectors improved the semantic representation of the input text, contributing to overall model performance. Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) models, together with an ensemble of the two, were developed for sentiment analysis across the eight Indian languages. Experimental results confirmed the efficacy of these models in detecting depressive patterns in social media posts. High accuracy, precision, recall, and F1-scores were observed across Gujarati, Kannada, Hindi, Bengali, Telugu, Marathi, Malayalam, and Tamil. The models performed strongly, indicating their adaptability to varied linguistic settings and their potential to identify depressive content reliably. Comparative evaluation showed that both the LSTM and GRU models performed well, with slight variations in accuracy and recall. Ensemble strategies combining LSTM and GRU predictions showed improved precision and recall in a few instances, highlighting the potential for boosting overall performance through ensemble approaches. However, variations in precision and recall across languages underscored the importance of considering language-specific nuances in model development. Despite the promising results, several limitations were noted, such as dataset representativeness and language nuances. The effectiveness of the models could be improved by using a balanced dataset. Furthermore, the diverse linguistic landscape of Indian languages introduced complexities that may require further exploration in future research. In conclusion, this study contributes valuable insights into depression detection on social media in multilingual contexts. The developed models show strong performance across diverse Indian languages, offering potential for depression detection and intervention. Recommendations for further study include expanding datasets, developing language-specific models, and exploring advanced architectures to address these challenges and enhance the effectiveness of depression detection models on social media platforms.
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/20698
Appears in Collections:	M.E./M.Tech. Computer Engineering
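
Illustrative note: the abstract mentions combining LSTM and GRU predictions and reporting precision, recall, and F1-score, but it does not say how the predictions are combined. The sketch below assumes simple soft voting (averaging the two models' depressive-class probabilities and thresholding at 0.5). The function names, the threshold, and the example scores are hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple


def soft_vote(lstm_probs: Sequence[float],
              gru_probs: Sequence[float],
              threshold: float = 0.5) -> List[int]:
    """Average two models' depressive-class probabilities, then threshold.

    Assumption: soft voting; the thesis may combine predictions differently.
    """
    if len(lstm_probs) != len(gru_probs):
        raise ValueError("both models must score the same posts")
    return [int((a + b) / 2 >= threshold)
            for a, b in zip(lstm_probs, gru_probs)]


def precision_recall_f1(y_true: Sequence[int],
                        y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for the positive (depressive = 1) class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical scores for four posts (not data from the thesis).
lstm_scores = [0.9, 0.4, 0.2, 0.7]
gru_scores = [0.7, 0.7, 0.1, 0.5]
labels = [1, 1, 0, 0]

ensemble_preds = soft_vote(lstm_scores, gru_scores)
precision, recall, f1 = precision_recall_f1(labels, ensemble_preds)
```

In this toy example the second post is missed by the LSTM scores alone (0.4) but recovered by the average (0.55), which is the kind of effect an ensemble can have on recall. Whether it helps in practice depends on the language and the data.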

Files in This Item:

File	Description	Size	Format
RAJDERKAR VIRAJ JAYANT M.Tech..pdf		6.68 MB	Adobe PDF
