CYBERBULLING DETECTION IN BENGALI LANGUAGE USING TRANSFER LEARNING

GUPTA, DIVYANSH

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21852

Title:	CYBERBULLING DETECTION IN BENGALI LANGUAGE USING TRANSFER LEARNING
Authors:	GUPTA, DIVYANSH
Keywords:	CYBERBULLING DETECTION; BENGALI LANGUAGE; TRANSFER LEARNING; BANGLABERT
Issue Date:	May-2025
Series/Report no.:	TD-8075
Abstract:	The growing use of social media platforms in Bengali-speaking communities has created more opportunities for cyberbullying. Hateful comments, whether political or sexual in nature, abound. Detection systems exist, but they are largely English-centric and do not account for the context and culture of Bangla. This project addresses that gap by building a multi-class cyberbullying detection model on BanglaBERT, a transformer-based language model trained specifically for Bangla. The proposed system was trained and evaluated on a real-world dataset of social media comments labeled as neutral, sexual comments, threats, political trolling, or trolling. Under the scope of data science, hexadecimal, or in simple terms, character systems are performed with the goal of ensuring the accuracy of quantitative data gathered through measuring instruments. The dataset was cleaned, normalized, and encoded before any analysis was done on it. BanglaBERT was then fine-tuned on this dataset using supervised learning. Accuracy, precision, recall, F1-score, and the confusion matrix were the primary indicators used to assess the system. On these criteria, the model reached an accuracy of fifty-five percent on the training set, with strong detection of political and threat content and striking rat policy content. The classification report shows that precision met the stated requirements, while the confusion matrix shows that misclassifications occur mainly between closely related classes. This thesis demonstrates the efficacy of transformer-based models in detecting malicious content for under-researched languages. The model's performance underlines the need to develop language-specific frameworks for internet safety.
Moreover, the research serves as a foundation for future work on context-sensitive identification of harmful material and on practical use in content moderation systems.
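The evaluation metrics named in the abstract (accuracy, precision, recall, F1-score, confusion matrix) can be illustrated with a minimal sketch. This is not the thesis code: the label strings, the helper names (`confusion_matrix`, `per_class_report`, `accuracy`), and the toy predictions are all assumptions made for illustration, and the numbers they produce say nothing about the reported results.

```python
from collections import Counter

# Class names follow the abstract; the exact label strings are assumed.
LABELS = ["neutral", "sexual", "threat", "political", "troll"]


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested dict: matrix[true_label][predicted_label] -> count."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def per_class_report(matrix, labels):
    """Compute precision, recall, and F1 for each class from the matrix."""
    report = {}
    for label in labels:
        tp = matrix[label][label]
        predicted = sum(matrix[t][label] for t in labels)  # column total
        actual = sum(matrix[label][p] for p in labels)     # row total
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"precision": precision, "recall": recall, "f1": f1}
    return report


def accuracy(matrix, labels):
    """Fraction of all examples that lie on the matrix diagonal."""
    total = sum(matrix[t][p] for t in labels for p in labels)
    correct = sum(matrix[l][l] for l in labels)
    return correct / total if total else 0.0


# Toy labels invented for illustration only (not the thesis data or results).
y_true = ["neutral", "neutral", "sexual", "threat",
          "political", "troll", "troll", "political"]
y_pred = ["neutral", "troll", "sexual", "threat",
          "political", "political", "troll", "political"]

matrix = confusion_matrix(y_true, y_pred, LABELS)
report = per_class_report(matrix, LABELS)
overall_accuracy = accuracy(matrix, LABELS)
macro_f1 = sum(r["f1"] for r in report.values()) / len(LABELS)
```

Off-diagonal cells such as `matrix["troll"]["political"]` show where closely related classes are confused with each other, which is the kind of error the abstract attributes to the confusion matrix.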
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/21852
Appears in Collections:	M.E./M.Tech. Computer Engineering

Files in This Item:

File	Description	Size	Format
Divyansh Gupta M.Tech.pdf		419.51 kB	Adobe PDF
