TOWARDS ETHICAL VISUAL REPRESENTATION: INVESTIGATING BIAS MITIGATION IN TEXT-TO-IMAGE GENERATION MODELS

PRERAK, SHAH

Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20830

Title:	TOWARDS ETHICAL VISUAL REPRESENTATION: INVESTIGATING BIAS MITIGATION IN TEXT-TO-IMAGE GENERATION MODELS
Authors:	PRERAK, SHAH
Keywords:	ETHICAL VISUAL BIAS MITIGATION TEXT-TO-IMAGE GENERATION MODELS
Issue Date:	May-2024
Series/Report no.:	TD-7359
Abstract:	Recent developments in text-to-image (T2I) generation models have had a profound impact on many fields. Automatic synthesis of images from textual descriptions now supports media creation, digital marketing, and other forms of content generation that seemed impossible a decade ago. It is also well documented that most of these models are predisposed to produce biased output: gender bias, cultural bias, age-related bias, and racial (skin tone) bias lead to unrepresentative and skewed images. Biased results carry a high cost; they may reinforce negative stereotypes or perpetuate social inequalities. A widely discussed case in point is the "GEMINI" incident. The damage and controversy caused by that AI-generated content made it evident that bias in AI systems must be checked in time to avert such consequences for society. Much work has gone into methods of bias estimation, and with that in mind, we examined how these biases manifest in text-to-image models during evaluation. Bias in an AI model can be evaluated in many ways, and the general lack of evaluation methods tailored to text-to-image models points to the need for more specific approaches. Using the same experiments together with our refinements to the prompt, we considerably improved the fairness of the generated images. We demonstrated that a fairer dataset of generated images can be curated by ethically enhancing the prompt, and showed that, with careful editing, the output images become much more inclusive and diverse. For example, changing a prompt from "a photo of a doctor" to "a photo of doctors, reflecting wide backgrounds on gender, skin tone, and age" yields a fairer and more balanced set of images.
We propose that future work apply machine learning to automatically identify biased cues and make the required modifications. This thesis offers a roadmap toward fairer and more inclusive T2I generation technologies while highlighting the significance of tackling biases in AI-generated content.
URI:	http://dspace.dtu.ac.in:8080/jspui/handle/repository/20830
Appears in Collections:	M.E./M.Tech. Computer Engineering

Files in This Item:

File	Description	Size	Format
SHAH PRERAK M.Tech..pdf		1.87 MB	Adobe PDF
