Please use this identifier to cite or link to this item: http://dspace.dtu.ac.in:8080/jspui/handle/repository/19009
Title: COMPARATIVE ANALYSIS OF FILTER FEATURE SELECTION ALGORITHMS FOR BUG PREDICTION USING MULTIPLE CLASSIFIERS
Authors: ARORA, HARSHIT
Keywords: ALGORITHMS; MULTIPLE CLASSIFIERS; BUG PREDICTION
Issue Date: May-2021
Series/Report no.: TD-5591
Abstract: Software is a set of instructions that defines functionality, and it makes our lives easier by carrying out heavy computation. That is why developing software has become essential today. Many researchers study the bug proneness of software using approaches that range from manual testing to automation. In automated approaches, Machine Learning algorithms are used to detect flaws in software, but their results vary from dataset to dataset, so they give inconsistent predictions for an arbitrary software project. Software Bug Prediction (SBP) is the process of classifying a new module as buggy or not buggy using historical data. With SBP, the number of modules that need to be tested decreases drastically. As software size keeps increasing, developing a classification model becomes challenging because of the massive amount of data to be processed. Feature selection can be used to reduce the dimensionality of this data: it shrinks the feature space of the system under observation by applying an evaluation criterion to select N relevant features from the original set. The reduced feature set helps increase both the accuracy and the throughput of the models. In this study, we analyzed the prediction performance of various classifiers combined with multiple ranked search-based feature selection algorithms (filter algorithms); that is, every feature selection algorithm is paired with each classifier to check the model's predictive power. We used Naive Bayes, Logistic Regression, K-Nearest Neighbours, Bayesian Network, Random Forest, Decision Tree, MultiLayer Perceptron and four types of ensemble classifier (Voting, Stacking, Bagging and Boosting). The data sets were collected from the publicly available PROMISE repository.
The area under the ROC curve (AUC) is used to analyze prediction performance, and the Friedman test is used to check the statistical significance of the results of the different models. This study shows that feature selection algorithms improve the performance of SBP models, and that Correlation Attribute Feature Evaluation and Symmetrical Uncertainty give effective results. Also, the Bagging ensemble gives the best results among all classifiers studied.
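The evaluation steps named in the abstract (filter-based feature ranking, AUC, and the Friedman test) can be illustrated with a minimal standard-library sketch. This is not the thesis implementation. The function names are hypothetical, plain Pearson correlation with the label stands in for the Correlation Attribute evaluator as an assumption, and the Friedman statistic is computed without a tie correction.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return 0.0 if vx == 0 or vy == 0 else cov / sqrt(vx * vy)

def top_n_features(rows, labels, n):
    """Filter step (assumed criterion): rank feature indices by |correlation with the label|."""
    columns = list(zip(*rows))
    scores = [abs(pearson(col, labels)) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:n]

def auc(scores, labels):
    """AUC as the fraction of (buggy, clean) pairs where the buggy module scores higher; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def average_ranks(values):
    """Rank values so the largest gets rank 1; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: -values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def friedman_statistic(auc_table):
    """Friedman chi-square for a table of AUC values: one row per dataset, one column per method."""
    n, k = len(auc_table), len(auc_table[0])
    rank_sums = [0.0] * k
    for row in auc_table:
        for j, r in enumerate(average_ranks(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(r * r for r in rank_sums) - 3.0 * n * (k + 1)

# Toy data, invented for illustration only (not from the PROMISE repository).
rows = [[1, 5, 0], [2, 3, 1], [3, 6, 0], [4, 4, 1]]   # module metrics
labels = [0, 0, 1, 1]                                  # 1 = buggy, 0 = clean
selected = top_n_features(rows, labels, 2)
toy_auc = auc([0.1, 0.4, 0.35, 0.8], labels)
toy_friedman = friedman_statistic([[0.90, 0.80, 0.70],
                                   [0.85, 0.75, 0.65],
                                   [0.95, 0.90, 0.60]])
```

In a full study, the selected feature indices would be used to train each classifier, the resulting AUC values would fill one row of the table per dataset, and the Friedman statistic would be compared against a chi-square distribution with k - 1 degrees of freedom.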
URI: http://dspace.dtu.ac.in:8080/jspui/handle/repository/19009
Appears in Collections: M.E./M.Tech. Computer Engineering

Files in This Item:
File: harshit arora M.Tech.pdf
Size: 1.42 MB
Format: Adobe PDF