Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20910
Title: COMPARATIVE ANALYSIS OF WORD EMBEDDING TECHNIQUES ON SOFTWARE DEFECT PREDICTION
Authors: SHARMA, GAURAV
Keywords: EMBEDDING TECHNIQUES; SOFTWARE DEFECT PREDICTION; Doc2Vec; TF-IDF; Word2Vec
Issue Date: May-2024
Series/Report no.: TD-7445
Abstract: Embeddings can capture semantic relationships, reduce dimensionality, and reveal patterns in data. They are widely used in machine learning because they integrate easily into prediction models. Embedding techniques such as Word2Vec, TF-IDF, FastText, and Doc2Vec are commonly used for software defect prediction, and choosing a suitable one is an important step in building a defect prediction model. This study compares these four widely used techniques for software defect prediction. The analysis is based on a diverse set of Java projects from the open-source PROMISE repository. Multiple deep learning models were trained and tested to assess the effectiveness of each embedding technique. Performance was measured with the Matthews correlation coefficient (MCC), specificity, accuracy, precision, recall, and F1 score. The results show that Doc2Vec significantly outperforms the other techniques, which indicates that it captures semantic nuances better and leads to more accurate defect predictions. FastText is the second-best performer, surpassing TF-IDF and Word2Vec on various metrics. TF-IDF falls short of Doc2Vec and FastText but still surpasses Word2Vec, which ranks last in this comparison.
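The evaluation metrics named in the abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch under that assumption, not the thesis's own evaluation code; the function name `binary_metrics` and the convention of returning 0.0 when a denominator is zero are illustrative choices.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Compute MCC, specificity, accuracy, precision, recall, and F1.

    tp, fp, tn, fn are confusion-matrix counts for the "defective" class.
    Returning 0.0 for an undefined ratio is an assumption made here,
    not something the abstract specifies.
    """
    def safe_div(num, den):
        return num / den if den else 0.0

    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)          # also called sensitivity
    specificity = safe_div(tn, tn + fp)
    accuracy = safe_div(tp + tn, tp + fp + tn + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)

    # MCC uses all four cells, so it stays informative on imbalanced data.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = safe_div(tp * tn - fp * fn, mcc_den)

    return {
        "mcc": mcc,
        "specificity": specificity,
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Defect datasets are often imbalanced, with far fewer defective modules than clean ones, which is one reason a study might report MCC alongside accuracy.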
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/20910
Appears in Collections: M.E./M.Tech. Computer Engineering

Files in This Item:
File | Description | Size | Format
GAURAV SHARMA M.Tech..pdf | | 5.19 MB | Adobe PDF

