Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/20910
Title: | COMPARATIVE ANALYSIS OF WORD EMBEDDING TECHNIQUES ON SOFTWARE DEFECT PREDICTION |
Authors: | SHARMA, GAURAV |
Keywords: | EMBEDDING TECHNIQUES; SOFTWARE DEFECT PREDICTION; Doc2Vec; TF-IDF; Word2Vec |
Issue Date: | May-2024 |
Series/Report no.: | TD-7445 |
Abstract: | Embeddings are known for their ability to capture semantic relationships, reduce dimensionality, and identify patterns in data. They are widely used in machine learning because they can be easily integrated into prediction models. Embedding techniques such as Word2Vec, TF-IDF, FastText, and Doc2Vec are commonly used for software defect prediction tasks. When building a defect prediction model, choosing a suitable embedding method is very important. This study undertakes a comprehensive comparison of these widely used embedding techniques for software defect prediction. The analysis is based on a diverse set of Java projects sourced from the open-source Promise repository. The evaluation involved training and testing multiple deep learning models to assess the effectiveness of each embedding technique. Several key evaluation metrics, including the Matthews correlation coefficient (MCC), specificity, accuracy, precision, recall, and F1 score, were used to measure performance. The results reveal that Doc2Vec significantly outperforms the other embedding techniques, demonstrating its superiority in capturing semantic nuances and contributing to more accurate defect predictions. FastText emerges as the second-best performer, surpassing TF-IDF and Word2Vec in various metrics. TF-IDF, while effective, falls short of the performance levels achieved by Doc2Vec and FastText, but still surpasses Word2Vec, which ranks last in this comparison. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/20910 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
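The evaluation metrics named in the abstract (MCC, specificity, accuracy, precision, recall, and F1 score) all follow from the four cells of a binary confusion matrix. The sketch below shows the standard definitions only. It is not taken from the thesis: the function name `binary_metrics` and the example counts are hypothetical and chosen purely for illustration.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; not the thesis implementation.
    Any ratio with a zero denominator is reported as 0.0.
    """
    def ratio(num, den):
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # also called sensitivity
    specificity = ratio(tn, tn + fp)
    accuracy = ratio(tp + tn, tp + fp + tn + fn)
    f1 = ratio(2 * tp, 2 * tp + fp + fn)

    # Matthews correlation coefficient: ranges from -1 to 1 and uses all
    # four cells, so it stays informative on imbalanced defect data.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, mcc_den)

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
        "mcc": mcc,
    }


# Illustrative counts only (not results from the thesis).
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because MCC accounts for true negatives as well as true positives, it is often reported alongside F1 when defective modules are a small minority of the dataset.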
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
GAURAV SHARMA M.Tech..pdf | | 5.19 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.