The development of correct and effective software defect prediction (SDP) models is one of the foremost needs of the software industry. Statistics of many defect-related open-source data sets reveal a class imbalance problem in object-oriented projects. Models trained on imbalanced data lead to inaccurate future predictions owing to biased learning and ineffective defect prediction. In addition, a large number of software metrics degrades model performance. This study aims at (1) identification of useful software metrics using correlation feature selection, (2) extensive comparative analysis of 10 resampling methods to generate effective machine learning models for imbalanced data, (3) inclusion of stable performance evaluators (AUC, GMean, and Balance), and (4) integration of statistical validation of results. The impact of 10 resampling methods is analyzed on selected features of 12 object-oriented Apache datasets using 15 machine learning techniques. The performance of the developed models is analyzed using AUC, GMean, Balance, and sensitivity. Statistical results advocate the use of resampling methods to improve SDP. Random oversampling yields the best predictive capability among the developed defect prediction models. The study provides a guideline for identifying metrics that are influential for SDP. Oversampling methods outperform undersampling methods.

Software Defect Prediction (SDP) deals with uncovering probable future defects. Efficient defect prediction helps in the timely identification of areas in software that can lead to defects, which supports better resource utilization (Malhotra, 2016). Source code metrics give useful insights into software quality attributes such as cohesion, coupling, size, and inheritance, and are extensively used in developing software defect models (Basili, Briand & Melo, 1996; Singh, Kaur & Malhotra, 2010; Radjenović et al., 2013). Effective models can be generated using object-oriented (OO) metrics. With the increasing complexity of software, early prediction of defects and assurance of good software quality become difficult tasks. To accomplish SDP, researchers have used machine learning (ML) techniques over the past two decades.
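To make the ideas of random oversampling and the imbalance-aware evaluators more concrete, here is a minimal Python sketch. It is not the study's implementation: the function names `random_oversample` and `imbalance_metrics` are hypothetical, and the formulas are the definitions commonly used in the defect prediction literature (GMean as the geometric mean of sensitivity and specificity; Balance as one minus the normalized Euclidean distance from the ideal point where the false alarm rate is 0 and sensitivity is 1). The text above does not spell these formulas out, so treat them as assumptions. The sketch also assumes both classes are present, so no denominator is zero.

```python
import math
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until every
    class has as many rows as the largest class (illustrative sketch)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # Candidate rows of this class to draw duplicates from.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def imbalance_metrics(tp, fn, fp, tn):
    """Sensitivity, GMean, and Balance from confusion-matrix counts.
    Assumes tp + fn > 0 and fp + tn > 0."""
    sensitivity = tp / (tp + fn)   # probability of detection
    specificity = tn / (tn + fp)
    pf = fp / (fp + tn)            # probability of false alarm
    gmean = math.sqrt(sensitivity * specificity)
    # Distance from the ideal point (pf = 0, sensitivity = 1), scaled to [0, 1].
    balance = 1 - math.sqrt(pf ** 2 + (1 - sensitivity) ** 2) / math.sqrt(2)
    return {"sensitivity": sensitivity, "gmean": gmean, "balance": balance}


if __name__ == "__main__":
    # Toy data: 8 non-defective (0) and 2 defective (1) classes.
    rows = [[i] for i in range(10)]
    labels = [0] * 8 + [1] * 2
    _, new_labels = random_oversample(rows, labels)
    print(Counter(new_labels))
    print(imbalance_metrics(tp=8, fn=2, fp=10, tn=90))
```

In practice, resampling should be applied only to the training split, so that the test data keeps the original class distribution and the evaluators reflect realistic conditions.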