Majority filter-based minority prediction (MFMP): An approach for unbalanced datasets

Bibliographic Details
Published in: TENCON 2008 - 2008 IEEE Region 10 Conference, pp. 1-6
Main Authors: Padmaja, T.M., Krishna, P.R., Bapi, R.S.
Format: Conference Proceeding
Language:English
Published: IEEE 01-11-2008
Description
Summary:For many data mining and machine learning applications, predicting minority class samples from skewed, unbalanced datasets is a crucial problem. To address this problem, we propose a majority filter-based minority prediction (MFMP) approach for unbalanced datasets. MFMP adopts an unsupervised learning technique to select samples for supervised learning. The approach has two steps. In the first step, minority samples are clustered and majority class samples that lie outside the minority classification regions are identified; this improves the minority prediction rate. In the second step, majority samples are randomly selected within individual clusters, which enhances the majority prediction rate. We studied the behavior of the MFMP approach experimentally and compared it with the traditional random under-sampling approach on a synthetic dataset and three UCI repository datasets using the following classifiers: decision tree, k-nearest neighbor, Naive Bayes, and radial basis function network. Precision, recall, and F-measure are used to evaluate classifier performance. The experimental evidence suggests that the MFMP approach exhibits good prediction rates for both minority and majority classes on all classifiers. Furthermore, the proposed approach outperforms the traditional random under-sampling approach. MFMP applied to the decision tree gave better predictions than the other classifiers studied.
ISBN:1424424089
9781424424085
ISSN:2159-3442
2159-3450
DOI:10.1109/TENCON.2008.4766705
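
Illustrative sketch: the two steps named in the summary can be read roughly as follows. This is not the authors' implementation; the record does not specify the clustering algorithm, how a "minority classification region" is defined, or how many samples are drawn per cluster. The sketch assumes plain k-means over the minority samples, treats each region as a ball around a cluster centroid that just covers the cluster's minority members, and groups the surviving majority samples by nearest centroid before random selection. The names `mfmp_like_sample`, `k`, and `per_cluster` are hypothetical.

```python
import math
import random


def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means on tuples; returns the list of centroids."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the previous centroid if a cluster empties
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return centroids


def mfmp_like_sample(minority, majority, k=2, per_cluster=None, seed=0):
    """Return a filtered, randomly under-sampled subset of the majority class."""
    rng = random.Random(seed)
    centroids = kmeans(minority, k, rng)

    # ASSUMPTION: a minority region is the ball around a centroid whose
    # radius reaches the farthest minority sample assigned to that centroid.
    radii = [0.0] * k
    for p in minority:
        c = nearest(p, centroids)
        radii[c] = max(radii[c], math.dist(p, centroids[c]))

    # Step 1 (assumed reading): keep majority samples outside every region.
    outside = [p for p in majority
               if all(math.dist(p, centroids[c]) > radii[c] for c in range(k))]

    # Step 2 (assumed reading): group the kept samples by nearest minority
    # cluster, then draw a random subset from each group.
    groups = [[] for _ in range(k)]
    for p in outside:
        groups[nearest(p, centroids)].append(p)
    if per_cluster is None:
        per_cluster = max(1, len(minority) // k)
    selected = []
    for g in groups:
        selected.extend(rng.sample(g, min(per_cluster, len(g))))
    return selected
```

Under these assumptions, a training set would be formed from all minority samples plus the output of `mfmp_like_sample(minority, majority)`, and then passed to any of the classifiers named in the summary.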