<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <title>DSpace Collection:</title>
  <link rel="alternate" href="http://dspace.dtu.ac.in:8080/jspui/handle/123456789/51" />
  <subtitle />
  <id>http://dspace.dtu.ac.in:8080/jspui/handle/123456789/51</id>
  <updated>2026-07-01T06:01:15Z</updated>
  <dc:date>2026-07-01T06:01:15Z</dc:date>
  <entry>
    <title>SOCIAL BIAS IDENTIFICATION AND MITIGATION IN NATURAL LANGUAGE TEXT USING MACHINE LEARNING</title>
    <link rel="alternate" href="http://dspace.dtu.ac.in:8080/jspui/handle/repository/22944" />
    <author>
      <name>KAMBOJ, PRADEEP</name>
    </author>
    <author>
      <name>KUMAR, SHAILENDER (SUPERVISOR)</name>
    </author>
    <author>
      <name>GOYAL, VIKRAM (CO-SUPERVISOR)</name>
    </author>
    <id>http://dspace.dtu.ac.in:8080/jspui/handle/repository/22944</id>
    <updated>2026-06-25T05:08:43Z</updated>
    <published>2026-04-01T00:00:00Z</published>
    <summary type="text">Title: SOCIAL BIAS IDENTIFICATION AND MITIGATION IN NATURAL LANGUAGE TEXT USING MACHINE LEARNING
Authors: KAMBOJ, PRADEEP; KUMAR, SHAILENDER (SUPERVISOR); GOYAL, VIKRAM (CO-SUPERVISOR)
Abstract: Advanced Artificial Intelligence (AI) methods have enabled the creation of &#xD;
sophisticated large language models (LLMs) capable of generating human-like text &#xD;
and handling a broad spectrum of complex language comprehension tasks. The last &#xD;
decade has seen the advent of LLMs that fill crucial roles across a variety of &#xD;
applications, including automated content generation and summarization, healthcare &#xD;
analytics, legal decision support, conversational agents, and educational technologies. &#xD;
Despite their remarkable abilities, these models often reflect and even amplify the &#xD;
social biases embedded in the large datasets on which they are trained. These biases &#xD;
can manifest as stereotypes or unjust associations related to gender, race, religion, &#xD;
profession, or other social features. When these AI systems are deployed in high-stakes &#xD;
domains where fairness and reliability are paramount, the presence of such biases &#xD;
raises major ethical, social, and technical concerns. As a result, understanding, &#xD;
measuring, and mitigating bias in LLMs has emerged as a prominent research &#xD;
challenge at the forefront of responsible and trustworthy AI. &#xD;
This thesis constitutes a thorough exploration of social bias in natural language text &#xD;
generated by language models (LMs) and LLMs, with a focus on systematic &#xD;
approaches to measuring, evaluating, and mitigating it. The research draws on &#xD;
theoretical, empirical, experimental, and methodological approaches to investigate &#xD;
bias from several angles across the AI pipeline, including word embeddings, &#xD;
contextualized language models, prompt-based inference functions, and fine-tuning &#xD;
strategies. The work focuses on understanding biases across these components and &#xD;
seeks practical solutions to build fairer and more trustworthy generative AI systems. &#xD;
The initial phase of the research investigates gender bias in contextualized word &#xD;
embeddings generated by transformer-based LMs. Word embeddings are the building &#xD;
blocks of language in many NLP systems, and biases encoded in these representations &#xD;
can carry over to downstream applications. The gender direction in the embedding &#xD;
space is extracted, and the gender polarity of profession-related terms (occupation &#xD;
names) with respect to gendered pronouns is calculated, yielding a quantitative &#xD;
framework for measuring one type of bias: that women or men are less likely to pursue &#xD;
certain professions. Indeed, an experimental analysis shows that dynamic embeddings &#xD;
from transformer-based models exhibit substantial gender associations even in the &#xD;
absence of explicit gender information in the input text. To alleviate this problem, we &#xD;
propose a form of post-processing debiasing that modifies the embedding &#xD;
representations to reduce stereotypical associations while preserving the semantic &#xD;
relationships among words. The experimental results show that the proposed method &#xD;
can significantly alleviate gender bias in profession embeddings, thereby balancing the &#xD;
model’s representations. &#xD;
Building on this foundation, the thesis broadens the analysis to large language models &#xD;
and a wider range of societal biases stemming from multiple demographic attributes. &#xD;
We introduce a systematic evaluation framework for bias in LLM-generated outputs, &#xD;
in part by creating a curated inference dataset from previously established bias &#xD;
benchmarks. The dataset includes contexts that encourage language models to generate &#xD;
stereotypical, anti-stereotypical, and neutral responses, enabling systematic &#xD;
assessment of model behaviour. This study provides a comprehensive mechanism for &#xD;
analyzing how different models respond to socially sensitive contexts and how bias &#xD;
manifests in generated text. &#xD;
This research makes an important contribution by exploring prompt engineering to &#xD;
both detect and mitigate bias in LLMs. Several types of prompt variants are developed &#xD;
to investigate the effects of their design on model behaviour, namely standard, &#xD;
chain-of-thought, cognitive-style, and human-persona prompts. These prompts are &#xD;
systematically assessed to study the effects of various prompting techniques on output &#xD;
bias. Debiased versions of these prompts are also proposed; they explicitly elicit &#xD;
neutral reasoning and unbiased decision-making. &#xD;
The introduction of prompt-only bias evaluation is a key aspect of the extended work, &#xD;
exploring whether biased responses can be induced by prompts alone, without context. &#xD;
Experimental results indicate that when certain prompts are presented to language &#xD;
models, those models make stereotypical predictions, suggesting that bias arises from &#xD;
the interaction between prompts and the models' reasoning mechanisms, rather than &#xD;
solely from the training data. This underlines the importance of careful prompt design &#xD;
and evaluation when deploying language models in real-world settings. Alongside this &#xD;
bias analysis, the research also delves into the issue of hallucination in LLMs, whereby &#xD;
a model provides confident answers that are factually incorrect or unsupported. Across &#xD;
most domains, hallucinations undermine the model’s reliability and may introduce &#xD;
risks in critical domains such as healthcare, legal advice, and policy analysis. To tackle &#xD;
this phenomenon, the thesis presents a contrastive decoding method that uses &#xD;
disturb prompts to compare the probability distributions of model outputs in &#xD;
same-prompt and perturbation-prompt scenarios. The method helps detect hallucinated &#xD;
content and enhances the factual consistency of outputs by comparing responses to &#xD;
normal prompts with those to perturbed prompts. The results show that contrastive &#xD;
prompting methods can mitigate hallucination and improve the robustness of language &#xD;
model outputs. &#xD;
Another important aspect of the research is assessing how well fine-tuning approaches &#xD;
mitigate biases. Large open-source language models are fine-tuned on balanced sets &#xD;
with equal numbers of biased and unbiased statements across a wide range of social &#xD;
categories. Fine-tuning trains the models to produce more neutral and fair responses &#xD;
while retaining their language comprehension. Experimental results show that &#xD;
fine-tuning with fairness-aware special prompts significantly reduces the model's &#xD;
biased outputs and improves fairness performance. &#xD;
In conclusion, the work in this thesis demonstrates that bias in LMs is a complex, &#xD;
multifaceted phenomenon with multiple underlying sources, including training data, &#xD;
representation learning, and prompting. Tackling this challenge requires the integrated &#xD;
use of bias measurement, dataset design, prompt engineering, model fine-tuning, and &#xD;
evaluation metrics. The methodologies are cross-disciplinary, offering actionable tools &#xD;
to identify and prevent bias in generative AI systems without sacrificing performance &#xD;
or usability. &#xD;
This work extends beyond technical contributions, establishing the need for a broader &#xD;
meaning of fair and responsible development in the internalization of AI. Overall, this &#xD;
thesis provides a comprehensive overview of bias in LMs and LLMs. The research, by integrating &#xD;
representation-level analysis, prompt-based evaluation, hallucination detection, and &#xD;
fairness-aware fine-tuning, provides novel insights into the mechanisms that produce &#xD;
biases in AI systems while suggesting appropriate strategies to mitigate them. The &#xD;
results of this work demonstrate the potential to help establish more ethical, fair, &#xD;
transparent, and socially responsible generative AI technologies that can serve a wider &#xD;
range of communities without perpetuating harmful stereotypes or obesity-related &#xD;
inequalities.</summary>
    <dc:date>2026-04-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>EEG SIGNAL CLASSIFICATION USING FEW-SHOT LEARNING</title>
    <link rel="alternate" href="http://dspace.dtu.ac.in:8080/jspui/handle/repository/22764" />
    <author>
      <name>AHUJA, CHIRAG</name>
    </author>
    <id>http://dspace.dtu.ac.in:8080/jspui/handle/repository/22764</id>
    <updated>2026-06-08T05:46:01Z</updated>
    <published>2026-02-01T00:00:00Z</published>
    <summary type="text">Title: EEG SIGNAL CLASSIFICATION USING FEW-SHOT LEARNING
Authors: AHUJA, CHIRAG
Abstract: Electroencephalogram (EEG) signals are crucial in various applications, including&#xD;
Motor Imagery, Emotion Recognition, Visual Evoked Potentials, and Mental Workload&#xD;
assessment. However, EEG classification remains challenging due to limited labelled&#xD;
data, high noise levels, and substantial inter- and intra-subject variability. This thesis&#xD;
addresses these challenges by leveraging Few-Shot Learning (FSL) techniques to enable&#xD;
effective learning from minimal data for EEG signal classification.&#xD;
To overcome key limitations, this research integrates Data Augmentation, Transfer&#xD;
Learning, and Self-Supervised Learning (SSL) within the FSL framework. Specifically,&#xD;
it focuses on (1) developing EEG-specific data augmentation strategies to mitigate data&#xD;
scarcity, (2) designing a transfer learning methodology to facilitate efficient knowledge&#xD;
transfer across subjects, and (3) formulating SSL methods to enhance FSL with minimal&#xD;
labelled data.&#xD;
Firstly, the thesis presents a comprehensive literature review of FSL techniques&#xD;
in EEG classification, detailing data augmentation, transfer learning, and SSL&#xD;
methodologies. It establishes best practices for FSL for EEG classification and provides&#xD;
standardized guidelines for reporting results in future studies.&#xD;
Secondly, it explores data augmentation techniques to reduce dependence on&#xD;
limited EEG datasets by generating realistic augmented samples. It introduces the&#xD;
Auto-Augmentation for Emotion Recognition in EEG - A Class and Subject Invariant&#xD;
Approach (ADAPTER) framework, which, when integrated with the cross-subject&#xD;
model Self-Organizing Graph Neural Network (SOGNN), achieves an F1-score gain of&#xD;
around 2% over vanilla SOGNN, reaching 88.54% cross-subject accuracy on SEED.&#xD;
Thirdly, recognizing the need for improved subject adaptation, the thesis proposes&#xD;
a novel framework called Transfer and Robust Adaptation of New Subjects in EEG&#xD;
Technology (TRANSIT-EEG). It combines a subject-specific data-augmentation method, the&#xD;
Individualised Denoising Probabilistic Model (IDPM), with Low-Rank Adaptation&#xD;
(LoRA) based transfer learning on an enhanced SOGNN model called Self-Organizing&#xD;
Graph Attention Transformer (SOGAT). Experimental evaluations on SEED and&#xD;
Phyaat datasets demonstrate superior cross-subject F1 scores of 91.53% and 87.78%,&#xD;
respectively.&#xD;
Finally, the work addresses cross-device generalization in EEG classification&#xD;
through two Self-Supervised Learning frameworks: (i) Self-Supervised Enhancement&#xD;
for Multidimensional Emotion Recognition using GNNs for EEG (SS-EMERGE) and&#xD;
(ii) Unified Framework for Yielding EEG-based Emotion Recognition Model with&#xD;
Self-Supervised Learning (UNIFY-ESSL). SS-EMERGE employs a multidimensional&#xD;
architecture to capture temporal, spectral, and spatial features. A meiosis-based&#xD;
data-augmentation pretext task drives cross-subject generalization. The model delivers&#xD;
Macro-F1 scores of 92.35% and 81.51% on SEED and SEED-IV, respectively. When&#xD;
fine-tuned with only half of the labels, it still achieves 86.13% and 76.75% on SEED&#xD;
and SEED-IV, respectively. UNIFY-ESSL evaluates Contrastive Learning (SimCLR)&#xD;
and Contrastive Predictive Coding (CPC) based pretext tasks alongside a proposed&#xD;
data sampling strategy. The experimental results show that SimCLR attains F1-&#xD;
scores of 82.62%, 87.83%, and 89.05% on SEED, DEAP, and DREAMER datasets,&#xD;
respectively, while CPC achieves 81.35%, 82.27%, and 91.23%. UNIFY-ESSL improves&#xD;
cross-dataset generalization, with a 1-2% performance gain on DREAMER and maintained&#xD;
performance on DEAP despite channel reduction, although SEED experiences a 3%&#xD;
F1-score drop due to significant channel reduction.&#xD;
These contributions enable realistic data augmentation, rapid adaptation to new&#xD;
subjects for personalization, and unified modeling across datasets—advancing robust,&#xD;
adaptable, and generalizable EEG classification for diverse real-world applications.</summary>
    <dc:date>2026-02-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>DEVELOPMENT OF LINK PREDICTION MODEL IN SOCIAL NETWORK</title>
    <link rel="alternate" href="http://dspace.dtu.ac.in:8080/jspui/handle/repository/22763" />
    <author>
      <name>ZIYA, FATIMA</name>
    </author>
    <author>
      <name>Kumar, Sanjay (SUPERVISOR)</name>
    </author>
    <id>http://dspace.dtu.ac.in:8080/jspui/handle/repository/22763</id>
    <updated>2026-06-08T05:45:54Z</updated>
    <published>2026-02-01T00:00:00Z</published>
    <summary type="text">Title: DEVELOPMENT OF LINK PREDICTION MODEL IN SOCIAL NETWORK
Authors: ZIYA, FATIMA; Kumar, Sanjay (SUPERVISOR)
Abstract: Link prediction in social networks plays a crucial role in understanding network&#xD;
evolution, identifying potential interactions, and supporting applications such as&#xD;
recommendation systems, community analysis, and the discovery of biological networks.&#xD;
The fundamental problem of link prediction is to estimate the likelihood of future or&#xD;
missing connections between pairs of nodes based on existing network information,&#xD;
structural patterns, node attributes, and temporal evolution. However, real-world&#xD;
networks are highly complex, sparse, dynamic, and heterogeneous, making traditional&#xD;
similarity-based and shallow learning approaches insufficient to capture deep&#xD;
structural semantics and evolving behavioral patterns.&#xD;
In this thesis, we introduce a robust and adaptive approach to link prediction in&#xD;
social networks. The present study integrates traditional similarity-based techniques&#xD;
with advanced deep learning models that draw on structure- and attribute-aware&#xD;
information rather than a single similarity index, with the aim of improving the&#xD;
performance and reliability of the proposed methodology.&#xD;
The first model, GSVAELP, is a hybrid GraphSAGE-VAE model that leverages&#xD;
local neighborhood aggregation with probabilistic latent-space embedding,&#xD;
successfully capturing both structural dependencies and latent relational patterns. This&#xD;
laid the foundation for robust structure-and-attribute-aware link prediction.&#xD;
The second study, MetaLP-DGI, introduced centrality-aware Deep Graph Infomax&#xD;
with meta-learning, enhancing embedding quality by incorporating influential node&#xD;
characteristics while improving generalization across heterogeneous networks.&#xD;
The third model, Hybrid Graph Embedding and Ensemble Learning, demonstrated&#xD;
that combining multiple embeddings with ensemble classifiers significantly improves&#xD;
predictive consistency and reduces model bias.&#xD;
Further, the fourth model enhancement is achieved through MetaLP-DGI, which&#xD;
utilizes Deep Graph Infomax (DGI) embeddings integrated with a centrality-aware&#xD;
transition matrix to capture both global and local structural dependencies. The&#xD;
meta-learning component in MetaLP-DGI optimizes the learning process across&#xD;
heterogeneous datasets, improving robustness and adaptability. Complementing these in-depth&#xD;
approaches, the fifth study, Link Prediction in Social Networks: A Hybrid Approach&#xD;
with Graph Embedding and Ensemble Learning, combines structure- and&#xD;
attribute-based embeddings with ensemble classifiers, such as CatBoost and Random Forest, to&#xD;
deliver high-accuracy predictions in social network scenarios. Finally, the last study,&#xD;
UnifiedAttri2Vec–LSTM, constructs a unified embedding by integrating multiple&#xD;
embedding algorithms through Attri2Vec and leverages LSTM to model temporal and&#xD;
structural dependencies simultaneously.&#xD;
Overall, this thesis contributes a comprehensive exploration of hybrid, generative,&#xD;
and meta-learning-based frameworks for link prediction, establishing a strong&#xD;
foundation for adaptive and scalable graph analytics. The progressive integration of centrality,&#xD;
attention, temporal evolution, and ensemble learning provides a unified roadmap for&#xD;
advancing intelligent link prediction in complex and dynamic networked systems.</summary>
    <dc:date>2026-02-01T00:00:00Z</dc:date>
  </entry>
  <entry>
    <title>DEVELOPMENT AND VALIDATION OF HYBRID ALGORITHMS FOR SOFTWARE DEFECT PREDICTION</title>
    <link rel="alternate" href="http://dspace.dtu.ac.in:8080/jspui/handle/repository/22754" />
    <author>
      <name>CHAWLA, SONALI</name>
    </author>
    <author>
      <name>MALHOTRA, RUCHIKA (SUPERVISOR)</name>
    </author>
    <author>
      <name>SHARMA, ANJALI (CO-SUPERVISOR)</name>
    </author>
    <id>http://dspace.dtu.ac.in:8080/jspui/handle/repository/22754</id>
    <updated>2026-06-08T05:44:33Z</updated>
    <published>2026-03-01T00:00:00Z</published>
    <summary type="text">Title: DEVELOPMENT AND VALIDATION OF HYBRID ALGORITHMS FOR SOFTWARE DEFECT PREDICTION
Authors: CHAWLA, SONALI; MALHOTRA, RUCHIKA (SUPERVISOR); SHARMA, ANJALI (CO-SUPERVISOR)
Abstract: Software defect prediction (SDP) is an important research subject aimed at improving&#xD;
the reliability, maintainability, and overall quality of software systems. The rapid&#xD;
development of software projects raises the need for robust and accurate predictive&#xD;
models. While traditional machine learning (ML) and statistical methods have shown&#xD;
promise for SDP, challenges like high-dimensional data, imbalanced data, inefficient&#xD;
feature selection, and model-tuning limitations persist. To overcome these&#xD;
limitations, this research focuses on the development and validation of hybrid algorithms&#xD;
that leverage the power of both machine learning and metaheuristic optimization&#xD;
techniques to improve predictive performance for SDP. The research is&#xD;
validated through systematic review, empirical studies, and the development of novel&#xD;
algorithms applicable in real-world software development environments.&#xD;
The research is systematically structured into phases, addressing distinct&#xD;
components of SDP. The initial phase involves a systematic literature review&#xD;
that evaluates the latest hybrid algorithms that enhance the predictive&#xD;
performance of SDP models and identifies research gaps. The review develops a framework&#xD;
for analyzing the current state of the art with respect to hybrid algorithms on multiple&#xD;
dimensions and highlights the gaps that this thesis works to address. In&#xD;
subsequent phases, the research develops and validates several novel hybrid algorithms&#xD;
using benchmark datasets from repositories such as NASA, PROMISE, and AEEEM.&#xD;
These later phases include addressing the prime issues of dataset imbalance,&#xD;
designing improved feature selection techniques, implementing hyper-parameter tuning,&#xD;
and evaluating the proposed hybrid models against established baseline methods to&#xD;
demonstrate their effectiveness in real-world software defect prediction scenarios.&#xD;
High-dimensional software datasets greatly influence the efficiency and&#xD;
accuracy of predictive models. Feature selection plays a vital role in simplifying&#xD;
complex datasets while retaining the most significant information. A hybrid SDP&#xD;
model integrating Binary Particle Swarm Optimization (BPSO), Synthetic&#xD;
Minority Oversampling Technique (SMOTE), and Artificial Neural Network (ANN) is&#xD;
proposed to improve software quality. One of the significant contributions of this&#xD;
research is the development of a hybrid defect prediction framework that integrates&#xD;
filter feature selection (Information Gain, ReliefF, and Chi-square) and metaheuristic&#xD;
optimization (Opposition-based Whale Optimization Algorithm) for feature&#xD;
selection with an attention-based deep learning classifier, a one-dimensional Convolutional&#xD;
Neural Network (1D-CNN), to achieve higher classification performance. This model is&#xD;
particularly valuable when dealing with large datasets, complex feature interactions,&#xD;
and the need for balancing multiple objectives, such as maximizing classification&#xD;
performance while minimizing the number of features.&#xD;
Predictive models for SDP often underperform when using default configurations,&#xD;
highlighting the critical need for hyperparameter optimization in maximizing model&#xD;
effectiveness. In this research work, we employed advanced optimization techniques,&#xD;
specifically Grey Wolf Optimization (GWO) and Salp Swarm Optimization (SSO)&#xD;
algorithms, in combination with machine learning and ensemble classifiers to create more&#xD;
effective hybrid models for SDP. These nature-inspired techniques navigate complex&#xD;
parameter spaces to achieve an effective balance between exploration and exploitation&#xD;
in an optimization process. This study highlights that appropriate hyperparameter&#xD;
tuning can yield a significant performance improvement because each predictive model&#xD;
undergoes comprehensive testing for different combinations of parameters before its&#xD;
optimal parameters are reached.&#xD;
Based on the promising outcomes of the hybrid algorithms developed for defect&#xD;
prediction, we further investigate their effectiveness by evaluating various hybrid&#xD;
approaches across multiple datasets to ensure the model's generalizability. The&#xD;
experimental results are favourable for the hybrid models, which outperform traditional&#xD;
ML and statistical defect prediction models. This superiority is evident across key&#xD;
performance metrics, like F1-score, AUC-ROC, Recall, Precision, G-mean, and MCC.&#xD;
Furthermore, rigorous statistical testing confirms the reliability and robustness of these&#xD;
advanced techniques, reinforcing their effectiveness in SDP.&#xD;
In conclusion, this research significantly advances the field of SDP by&#xD;
addressing key predictive modelling challenges through the development and validation of&#xD;
sophisticated hybrid techniques. The study strengthens the effectiveness, reliability,&#xD;
and real-world applicability of defect prediction models. This study offers innovative&#xD;
methods for enhancing software quality, which benefits both academia and industry.&#xD;
The insights generated from this research provide a foundation for future&#xD;
advancements in predictive modelling, which will eventually help create software systems that&#xD;
are more dependable, efficient, and free of flaws.</summary>
    <dc:date>2026-03-01T00:00:00Z</dc:date>
  </entry>
</feed>

