Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21461
Title: DESIGN OF FRAMEWORK FOR ADVERSARIAL ATTACKS & DEFENCES IN CLASSIFICATION MODELS
Authors: BAJAJ, ASHISH
Keywords: ADVERSARIAL ATTACKS & DEFENCES
CLASSIFICATION MODEL
DEEP LEARNING
FRAMEWORK
NLP
Issue Date: Aug-2024
Series/Report no.: TD-7795;
Abstract: The advent of Deep Learning has enabled us to train neural networks that handle intricate datasets with exceptional efficiency. Nevertheless, as research has progressed, numerous weaknesses in neural networks have been revealed. Adversarial Machine Learning is a research area that focuses on identifying and exploiting vulnerabilities in neural networks, causing them to misclassify inputs that are nearly identical to the original data. Adversarial attacks refer to a category of methods designed to intentionally cause neural networks to misclassify data across multiple domains and tasks. Our comprehensive review of the extensive and growing research on adversarial attacks reveals a notable dearth of work in the domain of NLP. This research presents a comprehensive examination of current textual adversarial attacks, viewed from multiple perspectives within Natural Language Processing. We have developed three novel techniques for adversarial attacks on text, as well as a strategy for defending against such attacks, and we conclude by examining potential directions for future research on adversarial machine learning in the textual domain. The investigation illustrates that NLP models are inherently vulnerable to adversarial texts, in which a few words or characters are altered to create perturbed text that misleads the model into making incorrect predictions while preserving the intended meaning for human readers. The present study introduces HOMOCHAR, Non-Alpha-Num, and Inflect Text, novel text-attack approaches that operate at character-level and word-level granularity in a black-box setting, where the inner workings of the target system are unknown. The objective is to deceive a specific neural text classifier while obeying specified linguistic constraints, so that the changes remain imperceptible to humans. Extensive experiments evaluate the viability of the proposed attack methodologies on several widely used architectures, including Word-CNN, Bi-LSTM, and various advanced transformer models, across benchmark text-classification datasets such as AG News, MR, IMDb, and Yelp. Experimental evidence demonstrates that the proposed attack frameworks consistently outperform conventional methods, achieving substantially higher attack success rates (ASR) and generating higher-quality adversarial examples. The findings suggest that neural text classifiers can be bypassed, which could have substantial ramifications for existing policy approaches.
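For readers unfamiliar with black-box character-level attacks of this kind, the minimal Python sketch below illustrates the general attack family: words are ranked by probing the classifier's confidence, and characters in the most influential words are swapped for visually similar Unicode homoglyphs until the predicted label flips. This is not the thesis's HOMOCHAR algorithm, only an illustrative sketch; the scoring function model_predict and the homoglyph map are hypothetical placeholders.

    # Illustrative black-box character-level attack sketch (not the thesis's HOMOCHAR method).
    # model_predict is a hypothetical black-box scoring function: it takes a text string and
    # returns (predicted_label, confidence) without exposing gradients or model internals.

    HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456", "c": "\u0441"}

    def word_importance(text, model_predict):
        """Rank words by how much the confidence drops when each word is removed."""
        words = text.split()
        label, base_conf = model_predict(text)
        scores = []
        for i in range(len(words)):
            reduced = " ".join(words[:i] + words[i + 1:])
            _, conf = model_predict(reduced)
            scores.append((base_conf - conf, i))
        order = [i for _, i in sorted(scores, reverse=True)]
        return order, label

    def homoglyph_attack(text, model_predict, max_edits=5):
        """Greedily swap characters for visually similar homoglyphs until the label flips."""
        order, original_label = word_importance(text, model_predict)
        words = text.split()
        edits = 0
        for i in order:
            for pos, ch in enumerate(words[i]):
                if ch.lower() in HOMOGLYPHS and edits < max_edits:
                    perturbed = words[i][:pos] + HOMOGLYPHS[ch.lower()] + words[i][pos + 1:]
                    candidate_words = words[:i] + [perturbed] + words[i + 1:]
                    candidate = " ".join(candidate_words)
                    new_label, _ = model_predict(candidate)
                    edits += 1
                    if new_label != original_label:
                        return candidate          # successful adversarial example
                    words = candidate_words       # keep the perturbation and continue
        return None                               # attack failed within the edit budget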
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21461
Appears in Collections: Ph.D. Information Technology

Files in This Item:
File                      Description    Size       Format
ASHISH BAJAJ Ph.D..pdf                   6.08 MB    Adobe PDF

