Please use this identifier to cite or link to this item:
http://dspace.dtu.ac.in:8080/jspui/handle/repository/15007
Title: | DEFECT SEVERITY PREDICTION USING TEXT MINING |
Authors: | SRIVASTAVA, YOGESH KUMAR |
Keywords: | SEVERITY PREDICTION; TEXT MINING; TEXT CLASSIFICATION |
Issue Date: | Aug-2016 |
Series/Report no.: | TD NO.1708; |
Abstract: | The objective of this thesis is to help predict defect severity automatically. Defects found during the testing phases are logged in databases. A defect database that works well for one system or project may not work well for another, and a single defect database may not serve multiple projects. Moreover, the defect data collected for a project often lacks consistency. Although every project has a predefined set of required data fields, these fields do not provide enough information to assess the quality of the issues or to compare projects. The main purpose of this project is to develop a tool that helps predict defect severity during the testing process. The training data is taken from a text file. It is important to remove unnecessary words and obtain a set of words with which text classification can be done properly. Many text mining techniques are available for removing words that are not helpful for classification. Removing common words that are useless for classification is called stop word removal, and all occurrences of such words are removed before the bag of words is created. Stop word removal [5] alone does not yield the required set of words for classification, so the set is reduced further by stemming. Stemming analyzes the words present in a document to find those that can be treated as similar or equivalent. For example, 'applied', 'applying', 'applies' and 'apply' are similar words.
After stemming, the term frequency times inverse document frequency, often denoted "tf*idf", is calculated. To simplify the target, InfoGain is applied to the word set. After applying InfoGain, the selected words can be used for classification. A machine learning algorithm can then be applied to this training data so that it learns rules for predicting severity from the terms. |
URI: | http://dspace.dtu.ac.in:8080/jspui/handle/repository/15007 |
Appears in Collections: | M.E./M.Tech. Computer Engineering |
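The preprocessing steps named in the abstract (stop word removal, stemming, tf*idf weighting, and InfoGain ranking) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the thesis: the four toy defect summaries and their severity labels, the short stop word list, the crude suffix-stripping `stem` function (a stand-in for a real stemmer such as Porter's), and the presence/absence form of information gain.

```python
import math
from collections import Counter

# Hypothetical toy data (defect summary, severity label); not from the thesis.
REPORTS = [
    ("application crashes when saving the file", "high"),
    ("crash on applying the filter", "high"),
    ("typo in the settings label", "low"),
    ("label misaligned in the dialog", "low"),
]

# Tiny illustrative stop word list; real lists are far longer.
STOP_WORDS = {"the", "a", "an", "in", "on", "when", "is", "of", "to"}


def stem(word):
    """Crude suffix stripping, used here only as a stand-in for a real stemmer."""
    for suffix in ("ying", "ied", "ies", "ing", "es", "ed", "y", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, split on whitespace, drop stop words, then stem."""
    return [stem(w) for w in text.lower().split() if w not in STOP_WORDS]


def tf_idf(docs):
    """Per-document weights: (count / doc length) * ln(N / document frequency)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return [
        {t: (c / len(doc)) * math.log(n / df[t]) for t, c in Counter(doc).items()}
        for doc in docs
    ]


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((k / total) * math.log2(k / total) for k in Counter(labels).values())


def info_gain(term, docs, labels):
    """Entropy reduction from knowing whether `term` occurs in a document."""
    present = [y for d, y in zip(docs, labels) if term in d]
    absent = [y for d, y in zip(docs, labels) if term not in d]
    n = len(labels)
    conditional = (len(present) / n) * entropy(present) + (len(absent) / n) * entropy(absent)
    return entropy(labels) - conditional


docs = [preprocess(text) for text, _ in REPORTS]
labels = [label for _, label in REPORTS]
weights = tf_idf(docs)

# Rank the vocabulary by information gain, most informative terms first.
vocabulary = sorted({t for d in docs for t in d})
ranking = sorted(vocabulary, key=lambda t: info_gain(t, docs, labels), reverse=True)

if __name__ == "__main__":
    for term in ranking[:5]:
        print(term, round(info_gain(term, docs, labels), 3))
```

On this toy data, `crash` and `label` each separate the two severity classes perfectly, so they receive the maximum information gain of 1 bit. A classifier would then be trained on the tf*idf weights of the top-ranked terms only.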
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Text_Mining_Defect_prediction_Yogesh_Final1.pdf | | 1.92 MB | Adobe PDF | View/Open |