Incorporating adaptive discretization into genetic programming for data classification

Bibliographic Details
Published in: 2013 Third World Congress on Information and Communication Technologies (WICT 2013), pp. 127-133
Main Authors: Dufourq, Emmanuel; Pillay, Nelishia
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2013
Description
Summary: Genetic programming (GP) for data classification using decision trees has been successful in creating models that obtain high classification accuracies. When the data is categorical, GP can use decision trees directly to create models; when the data contains continuous attributes, however, discretization is required as a pre-processing step prior to learning. There has been no attempt to incorporate the discretization mechanism into the GP algorithm, and this serves as the rationale for this paper. This paper proposes an adaptive discretization method for inclusion in the GP algorithm: intervals are created randomly during the execution of the algorithm through the use of a new genetic operator. The proposed approach was tested on five data sets and serves as an initial attempt at dynamically altering the intervals of GP decision trees while simultaneously searching for an optimal solution during the learning phase. The proposed method performs well when compared to other non-GP adaptive methods.
DOI: 10.1109/WICT.2013.7113123
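
Illustration: the summary describes intervals on continuous attributes being created randomly by a genetic operator while GP runs. The record does not give the operator's details, so the following Python sketch is only one plausible reading, not the authors' method. The names IntervalNode and rediscretize, the uniform sampling of cut points, and the max_intervals parameter are all assumptions made for this example.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass
class IntervalNode:
    """Hypothetical decision-tree node that tests one continuous attribute."""
    attribute: int            # index of the continuous attribute tested here
    cut_points: List[float]   # sorted boundaries; k cut points give k + 1 intervals

    def branch(self, value: float) -> int:
        """Return the index of the interval (child branch) the value falls into."""
        for i, cut in enumerate(self.cut_points):
            if value < cut:
                return i
        return len(self.cut_points)


def rediscretize(node: IntervalNode, low: float, high: float,
                 max_intervals: int, rng: random.Random) -> IntervalNode:
    """Assumed operator: replace a node's intervals with randomly drawn ones.

    Cut points are sampled uniformly from [low, high]; this sampling scheme
    is an assumption, since the record does not specify one.
    """
    if max_intervals < 2:
        raise ValueError("max_intervals must be at least 2")
    n_cuts = rng.randint(1, max_intervals - 1)
    cuts = sorted(rng.uniform(low, high) for _ in range(n_cuts))
    return IntervalNode(node.attribute, cuts)


rng = random.Random(0)
node = IntervalNode(attribute=2, cut_points=[5.0])
mutated = rediscretize(node, low=0.0, high=10.0, max_intervals=4, rng=rng)
```

Because the operator returns a new node rather than editing the old one, a GP run could apply it alongside its other genetic operators and let fitness selection decide which interval layouts survive, which is the sense in which the discretization adapts during learning.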