Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20373
Title: DETECTING HATE IN MULTIMODAL MEMES USING MACHINE LEARNING
Authors: SHARMA, VAISHALI
Keywords: DETECTING HATE
MULTIMODAL MEMES
ARTIFICIAL INTELLIGENCE
MACHINE LEARNING
Issue Date: May-2022
Series/Report no.: TD-6795;
Abstract: Social media has become an integral part of everyday life, but its widespread use also brings the risk of exposure to offensive material. Artificial intelligence (AI) has established itself as an effective tool for detecting and removing such content, a task of growing importance. Hateful content in memes is particularly difficult to identify because the meaning arises from the combination of image and text rather than from either alone, which is how humans interpret them. An AI system intended to flag harmful memes therefore needs a thorough understanding of their content and context, much as a person does.

To address this problem, this work automatically classifies memes as hateful or not by combining textual features, image features, and additional information obtained through web entity recognition. The study uses the multimodal dataset from the Hateful Meme Detection Challenge 2020. State-of-the-art visual-language models perform well below non-expert humans on this dataset because it contains confounding examples, such as benign, contrastive, or counterfactual memes, which demonstrates the difficulty of the task. To reach high accuracy, models need a richer understanding of language, vision, current affairs, and the interactions between modalities.

The proposed method classifies memes using text, images, and information gathered through the online entity-identification procedure. The thesis also examines the weaknesses of this approach and ways to strengthen it in future work. Because they lack real-world knowledge, the models struggle to detect people's attributes and to classify racial or religious groups correctly. They also have difficulty recognising memes that evoke pain, abuse, and disability, and with understanding religious practices, traditional clothing, political and social allusions, and cultural norms.

The proposed architecture processes text and images simultaneously in two parallel streams trained with cross-attention; both streams are built on a bidirectional multi-head attention model. The preprocessing pipeline required by this design is also described.

The study was carried out in two phases. Phase 1 achieved an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.71 and an accuracy of 0.74 on the hateful-memes dataset. In Phase 2, the Hateful Meme Detection dataset was expanded with additional memes to broaden its coverage, and the model reached an AUROC of 0.8108 on the unseen test split and 0.7555 on the unseen dev split, with accuracies of 0.7352 and 0.7650 respectively.

Although the proposed method shows promising results, the thesis acknowledges its limits. The approach relies mainly on linguistic and visual features, which restricts its ability to identify offensive memes with subtle or intricate content, and the models require a large amount of training data, which can be difficult to obtain in real-world settings. The study also highlights the importance of fairness, accountability, and transparency in the development and use of such algorithms, as well as the ethical dilemmas raised by applying AI to content moderation.
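As a rough illustration of the two-stream cross-attention fusion described above, a minimal PyTorch-style sketch is given below. The thesis abstract does not prescribe an exact implementation, so the class name, feature dimensions, and pooling choices here are assumptions made only for clarity.

```python
# Minimal sketch (an assumption, not the thesis's actual code) of a two-stream
# cross-attention classifier for hateful-meme detection. CrossModalClassifier,
# text_dim, image_dim and the mean-pooling fusion are hypothetical choices.
import torch
import torch.nn as nn

class CrossModalClassifier(nn.Module):
    def __init__(self, text_dim=768, image_dim=2048, hidden=768, heads=8):
        super().__init__()
        # Project both modalities into a shared hidden size
        self.text_proj = nn.Linear(text_dim, hidden)
        self.image_proj = nn.Linear(image_dim, hidden)
        # Two parallel streams, each attending to the other modality
        self.text_to_image = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.image_to_text = nn.MultiheadAttention(hidden, heads, batch_first=True)
        # Binary head: hateful vs. not hateful
        self.classifier = nn.Linear(2 * hidden, 1)

    def forward(self, text_feats, image_feats):
        # text_feats: (batch, tokens, text_dim), e.g. token embeddings from a text encoder
        # image_feats: (batch, regions, image_dim), e.g. region features from a detector
        t = self.text_proj(text_feats)
        v = self.image_proj(image_feats)
        # Cross-attention: text queries attend over image keys/values, and vice versa
        t_attended, _ = self.text_to_image(query=t, key=v, value=v)
        v_attended, _ = self.image_to_text(query=v, key=t, value=t)
        # Pool each stream and fuse the two representations into one hatefulness logit
        fused = torch.cat([t_attended.mean(dim=1), v_attended.mean(dim=1)], dim=-1)
        return self.classifier(fused).squeeze(-1)
```

In such a setup, the logit would typically be passed through a sigmoid and thresholded for accuracy, while AUROC is computed over the raw scores, matching the two metrics reported in the abstract.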
In conclusion, the study presents a technique for automatically detecting offensive memes that combines textual features, image features, and web entity recognition. Despite its positive results, open problems remain before the effectiveness and accuracy of the models can be improved further. Future work may examine the inclusion of other modalities, such as audio or video, and may deepen the models' understanding of social and cultural context to improve the detection of hateful content. Ultimately, AI must be applied to content moderation with caution and transparency to ensure that these technologies are used ethically and responsibly.
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20373
Appears in Collections:M.E./M.Tech. Computer Engineering

Files in This Item:
File: VAISHALI SHARMA m.tECH.pdf
Size: 2.92 MB
Format: Adobe PDF

