Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20770
Title: LEGAL LINGUISTICS AND GENDER BIAS: STUDY OF LEGAL LANGUAGE MODELS AND DEBIASING STRATEGIES
Authors: GHOSH, TRISHA
Keywords: LEGAL LINGUISTICS
TRANSFORMER MODELS
INDIAN LEGAL SYSTEM
BIAS MITIGATION
BERT
XLNET
ROBERTA
DEBERTA
ILDC DATASET
LAW2VEC EMBEDDINGS
Issue Date: May-2024
Series/Report no.: TD-7288;
Abstract: This thesis delves into the intersection of legal linguistics and gender bias, employing natural language processing (NLP) techniques for legal judgment prediction using transformer-based models. The research is centered on the Indian legal context, focusing on the performance and bias mitigation of various models, including BERT, XLNet, RoBERTa, DeBERTa, ELECTRA, and BigBird, evaluated on the ILDC-single dataset. The analysis involves complex legal language, characterized by domain-specific keywords and archetypes, which poses significant challenges to traditional NLP models. Operating within the common law system, the Indian judiciary is characterized by complex and often outdated legal documents, which lead to ambiguities and inconsistencies. These challenges call for advanced NLP models capable of handling the nuances of legal text. Transformer-based models, with their self-attention mechanisms, offer promising solutions for comprehensive analysis of this legal complexity.

A central aspect of this thesis is the examination of gender bias in legal proceedings, an issue that directly affects the fairness of judicial outcomes. Gender bias manifests itself in a variety of ways, including discriminatory language in legal documents as well as biased judicial decisions. This study examines such bias using Law2Vec embeddings and proposes strategies to detect and reduce it. Methods such as projection onto a gender subspace and k-means clustering are used to measure bias, followed by the Hard Debiasing algorithm to reduce it. The effectiveness of debiasing is evaluated through court-decision prediction tasks to ensure that the embeddings retain their semantic integrity.

The methodology involves applying six transformer-based models to the ILDC-single dataset and comparing their performance in predicting court decisions. BigBird-RoBERTa demonstrated superior performance at about 80% accuracy, highlighting its ability to process long sequences and extract relevant context. The study also includes a detailed analysis of Law2Vec embeddings, identifying gender bias and effectively reducing it.

Despite the promising results, the thesis acknowledges the limitations of the current models, in particular their lack of explainability. The non-transparent decision-making processes of these models pose challenges for legal professionals who need transparent and interpretable information. Future research will focus on integrating explainability techniques and extending the datasets to increase the robustness and generalizability of the models. The findings highlight the importance of domain-specific pre-training and the need for continued efforts to address biases in legal NLP applications. By combining advanced transformer architectures with bias-mitigation methods, this research can support the development of more ethical, transparent, and effective AI tools for legal services, increasing their efficiency in the process.

In conclusion, this thesis demonstrates the effectiveness of transformer-based models in legal judgment prediction and the critical importance of reducing gender bias in legal language models. The study contributes to future developments in legal NLP and supports a more consistent and transparent judgment process.
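The bias-measurement and mitigation steps summarized in the abstract follow the standard gender-subspace approach. The Python sketch below illustrates the idea under the assumption that the Law2Vec vectors have been loaded into a word-to-vector dictionary; the defining pairs, word lists, and helper names are illustrative and are not taken from the thesis itself.

# Minimal sketch of the gender-subspace measurement and Hard Debiasing steps,
# assuming Law2Vec is available as a {word: np.ndarray} dictionary. Word lists
# and function names are illustrative, not the thesis's actual code.
import numpy as np

def gender_direction(vectors, pairs):
    # The principal direction of the differences between defining pairs
    # (e.g. he/she, him/her) approximates a one-dimensional gender subspace.
    diffs = np.stack([vectors[a] - vectors[b] for a, b in pairs])
    _, _, vt = np.linalg.svd(diffs - diffs.mean(axis=0), full_matrices=False)
    return vt[0]  # unit-length gender axis

def bias_score(vec, g):
    # Signed projection of a normalized word vector onto the gender axis;
    # values far from zero indicate a gendered embedding.
    return float(vec @ g) / float(np.linalg.norm(vec))

def hard_debias(vectors, g, neutral_words):
    # Neutralize words that should be gender-neutral by removing their
    # component along g (the equalization step is omitted for brevity).
    debiased = dict(vectors)
    for w in neutral_words:
        v = vectors[w]
        debiased[w] = v - (v @ g) * g
    return debiased

# Illustrative usage with hypothetical legal terms:
# g = gender_direction(law2vec, [("he", "she"), ("him", "her"), ("man", "woman")])
# scores = {w: bias_score(law2vec[w], g) for w in ("judge", "advocate", "witness")}
# clean_vectors = hard_debias(law2vec, g, ["judge", "advocate", "witness"])

Semantic retention, as described in the abstract, would then be checked by re-running the court-decision prediction task with the debiased vectors and comparing accuracy against the original embeddings.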
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20770
Appears in Collections:M.E./M.Tech. Computer Engineering

Files in This Item:
File: TRISHA GHOSH M.Tech.pdf
Size: 3.17 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.