An Improved SDA Based Defect Prediction Framework for Both Within-Project and Cross-Project Class-Imbalance Problems
Background. Solving the class-imbalance problem of within-project software defect prediction (SDP) is an important research topic. Although some class-imbalance learning methods have been presented, there exists room for improvement. For cross-project SDP, we found that the class-imbalanced source u...
Saved in:
Published in: | IEEE transactions on software engineering Vol. 43; no. 4; pp. 321 - 339 |
---|---|
Main Authors: | , , , |
Format: | Journal Article |
Language: | English |
Published: |
New York
IEEE
01-04-2017
IEEE Computer Society |
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | Background. Solving the class-imbalance problem of within-project software defect prediction (SDP) is an important research topic. Although some class-imbalance learning methods have been presented, there exists room for improvement. For cross-project SDP, we found that the class-imbalanced source usually leads to misclassification of defective instances. However, only one work has paid attention to this cross-project class-imbalance problem. Objective. We aim to provide effective solutions for both within-project and cross-project class-imbalance problems. Method. Subclass discriminant analysis (SDA), an effective feature learning method, is introduced to solve the problems. It can learn features with more powerful classification ability from original metrics. For within-project prediction, we improve SDA for achieving balanced subclasses and propose the improved SDA (ISDA) approach. For cross-project prediction, we employ the semi-supervised transfer component analysis (SSTCA) method to make the distributions of source and target data consistent, and propose the SSTCA+ISDA prediction approach. Results. Extensive experiments on four widely used datasets indicate that: 1) ISDA-based solution performs better than other state-of-the-art methods for within-project class-imbalance problem; 2) SSTCA+ISDA proposed for cross-project class-imbalance problem significantly outperforms related methods. Conclusion. Within-project and cross-project class-imbalance problems greatly affect prediction performance, and we provide a unified and effective prediction framework for both problems. |
---|---|
ISSN: | 0098-5589 1939-3520 |
DOI: | 10.1109/TSE.2016.2597849 |