Title: EVALUATING OPEN-SOURCE VISION-LANGUAGE MODELS FOR HATEFUL MEME DETECTION
Authors: MALLINATH, VHATKAR GANESH
Keywords: VISION-LANGUAGE MODELS
HATEFUL MEME DETECTION
AUROC
LoRA
Issue Date: May-2025
Series/Report no.: TD-8010;
Abstract: Detecting hateful content in internet memes poses a unique challenge due to the tight coupling of visual and textual information. We present a systematic evaluation of five open-source vision-language models across three practical scenarios: zero-shot prompting, few-shot in-context learning, and parameter-efficient fine-tuning with Low-Rank Adaptation (LoRA), all executed on freely available Kaggle T4 GPUs. Our zero-shot experiments highlight substantial performance swings driven by prompt design, emphasizing the need for careful prompt engineering. Introducing just two to four labeled examples in few-shot settings consistently improves classification, with top models exceeding 64% accuracy and macro-F1. Most notably, after only five epochs of LoRA fine-tuning, our best model delivers an AUROC of 85.81%, coming within 1.19 points of the state-of-the-art Retrieval-Guided Contrastive Learning benchmark (87.0% AUROC). By unifying evaluation protocols and demonstrating resource-aware methods, this work shows that near-state-of-the-art AUROC is achievable under tight computational constraints, making robust hateful meme detection more accessible for real-world moderation.
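
The LoRA fine-tuning and AUROC evaluation described in the abstract can be pictured with the short sketch below. It is not the thesis's code: Hugging Face peft and scikit-learn are assumed tooling, a small placeholder causal language model stands in for the open-source vision-language backbones, and all hyperparameters (rank, alpha, dropout, target modules) are illustrative choices.

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model
from sklearn.metrics import roc_auc_score

# Placeholder backbone; the thesis fine-tunes open-source vision-language models.
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

# Low-Rank Adaptation: inject small trainable rank-r update matrices into the
# attention projections while the base model weights stay frozen.
lora_config = LoraConfig(
    r=16,                                 # rank of the update matrices (assumed)
    lora_alpha=32,                        # scaling factor (assumed)
    lora_dropout=0.05,                    # adapter dropout (assumed)
    target_modules=["q_proj", "v_proj"],  # attention projections in OPT
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable

# After fine-tuning (the abstract reports five epochs), AUROC is computed from
# the model's predicted probability of the "hateful" class on the test split.
y_true = [0, 1, 1, 0, 1]              # ground-truth labels (toy example)
y_score = [0.2, 0.8, 0.6, 0.3, 0.9]   # predicted P(hateful) (toy example)
print(f"AUROC: {roc_auc_score(y_true, y_score):.4f}")

Because only the low-rank adapters receive gradients, this kind of setup fits within the memory budget of the free Kaggle T4 GPUs mentioned in the abstract, which is the resource constraint the thesis targets.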
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21799
Appears in Collections:M.E./M.Tech. Computer Engineering

Files in This Item:
File: Vhatkar Ganesh Mallinath M.Tech..pdf | Size: 994.37 kB | Format: Adobe PDF

