Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21461
Title: DESIGN OF FRAMEWORK FOR ADVERSARIAL ATTACKS & DEFENCES IN CLASSIFICATION MODELS
Authors: BAJAJ, ASHISH
Keywords: ADVERSARIAL ATTACKS & DEFENCES
CLASSIFICATION MODEL
DEEP LEARNING
FRAMEWORK
NLP
Issue Date: Aug-2024
Series/Report no.: TD-7795;
Abstract: The advent of Deep Learning has enabled us to train neural networks that handle intricate datasets with exceptional efficiency. Nevertheless, as research has progressed, numerous weaknesses in neural networks have been revealed. Adversarial Machine Learning is a research area that focuses on identifying and exploiting vulnerabilities in neural networks, causing them to misclassify inputs that are nearly identical to the original data. Adversarial attacks refer to a category of methods designed to intentionally cause neural networks to misclassify data across multiple domains and tasks. Our comprehensive review of the extensive and growing research on adversarial attacks reveals a notable dearth of work in the domain of NLP. This research presents a comprehensive examination of current textual adversarial attacks, viewed from multiple perspectives within Natural Language Processing. We have developed three novel techniques for adversarial attacks on text, as well as a strategy for defending against such attacks, and we conclude by examining potential directions for future research on adversarial machine learning in the textual domain. The investigation illustrates that NLP models are inherently vulnerable to adversarial texts, in which a few words or characters are altered to create perturbed text that misleads the model into making incorrect predictions while preserving the intended meaning for human readers. The present study introduces HOMOCHAR, Non-Alpha-Num, and Inflect Text, novel text-attack approaches that operate at character-level and word-level granularity in a black-box setting, where the inner workings of the target system are unknown. The objective is to deceive a specific neural text classifier while obeying specified linguistic constraints, so that the changes remain imperceptible to humans. Extensive experiments evaluate the viability of the proposed attack methodologies on several widely used architectures, including Word-CNN, Bi-LSTM, and various advanced transformer models, across benchmark text-classification datasets such as AG News, MR, IMDb, and Yelp. Experimental evidence demonstrates that the proposed attack frameworks consistently outperform conventional methods, achieving substantially higher attack success rates (ASR) and generating higher-quality adversarial examples. The findings suggest that neural text classifiers can be bypassed, which could have substantial ramifications for existing policy approaches.
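For readers unfamiliar with black-box character-level attacks of this kind, the minimal Python sketch below illustrates the general attack family: words are ranked by probing the classifier's confidence, and characters in the most influential words are swapped for visually similar Unicode homoglyphs until the predicted label flips. This is not the thesis's HOMOCHAR algorithm, only an illustrative sketch; the scoring function model_predict and the homoglyph map are hypothetical placeholders.

    # Illustrative black-box character-level attack sketch (not the thesis's HOMOCHAR method).
    # model_predict is a hypothetical black-box scoring function: it takes a text string and
    # returns (predicted_label, confidence) without exposing gradients or model internals.

    HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "i": "\u0456", "c": "\u0441"}

    def word_importance(text, model_predict):
        """Rank words by how much the confidence drops when each word is removed."""
        words = text.split()
        label, base_conf = model_predict(text)
        scores = []
        for i in range(len(words)):
            reduced = " ".join(words[:i] + words[i + 1:])
            _, conf = model_predict(reduced)
            scores.append((base_conf - conf, i))
        order = [i for _, i in sorted(scores, reverse=True)]
        return order, label

    def homoglyph_attack(text, model_predict, max_edits=5):
        """Greedily swap characters for visually similar homoglyphs until the label flips."""
        order, original_label = word_importance(text, model_predict)
        words = text.split()
        edits = 0
        for i in order:
            for pos, ch in enumerate(words[i]):
                if ch.lower() in HOMOGLYPHS and edits < max_edits:
                    perturbed = words[i][:pos] + HOMOGLYPHS[ch.lower()] + words[i][pos + 1:]
                    candidate_words = words[:i] + [perturbed] + words[i + 1:]
                    candidate = " ".join(candidate_words)
                    new_label, _ = model_predict(candidate)
                    edits += 1
                    if new_label != original_label:
                        return candidate          # successful adversarial example
                    words = candidate_words       # keep the perturbation and continue
        return None                               # attack failed within the edit budget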
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/21461
Appears in Collections: Ph.D. Information Technology

Files in This Item:
File                      Description    Size       Format
ASHISH BAJAJ Ph.D..pdf                   6.08 MB    Adobe PDF

