A preliminary study on the reuse of subtrees within decision trees in a genetic programming context for data classification
Published in: | 2013 Third World Congress on Information and Communication Technologies (WICT 2013) pp. 285 - 290 |
---|---|
Main Authors: | , |
Format: | Conference Proceeding |
Language: | English |
Published: | IEEE, 01-12-2013 |
Subjects: | |
Summary: | Genetic programming (GP) has been successful in creating data classification models that obtain high accuracy. In programming, creating functions is common practice, as it isolates a part of the code so that it can be reused. The encapsulation genetic operator promotes modularization: it encapsulates subtrees, which GP trees can then reuse during the execution of the algorithm. Models created for data classification problems tend to be large and complex, which creates a need for modular acquisition methods that promote the reuse of existing subtrees when solving classification problems. The effect of the encapsulation operator in GP on data classification problems has not previously been investigated. Two approaches were proposed. The first incorporated the encapsulation operator with no limitations on how the encapsulated subtrees are used; the second made use of a maintained list of encapsulated subtrees. The two proposed methods were tested on eight data sets, and the results show that the encapsulation operator improved the training accuracy on nearly every data set. |
---|---|
DOI: | 10.1109/WICT.2013.7113150 |
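
The encapsulation idea described in the summary can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Node` class, the `ENC<n>` naming scheme, the `library` dictionary (standing in for a maintained list of encapsulated subtrees), and the uniform random choice of subtree are all assumptions made for illustration. The sketch only shows the general mechanism: a subtree is stored under a name, replaced in the tree by a single terminal, and expanded again when the tree is evaluated or reused.

```python
import random
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """A decision-tree node: an attribute test (with children) or a leaf."""
    label: str
    children: List["Node"] = field(default_factory=list)


def all_nodes(tree: Node) -> List[Node]:
    """Return every node of the tree in pre-order."""
    nodes = [tree]
    for child in tree.children:
        nodes.extend(all_nodes(child))
    return nodes


def copy_tree(tree: Node) -> Node:
    """Deep-copy a tree so stored modules are not aliased."""
    return Node(tree.label, [copy_tree(c) for c in tree.children])


def encapsulate(tree: Node, library: Dict[str, Node],
                rng: random.Random) -> Optional[str]:
    """Store a randomly chosen internal subtree in `library` and replace it
    in place with one terminal that refers to it (hypothetical scheme)."""
    candidates = [n for n in all_nodes(tree) if n.children and n is not tree]
    if not candidates:
        return None  # nothing to encapsulate
    target = rng.choice(candidates)
    name = f"ENC{len(library)}"       # assumed naming convention
    library[name] = copy_tree(target)
    target.label = name               # the subtree becomes a single terminal
    target.children = []
    return name


def expand(tree: Node, library: Dict[str, Node]) -> Node:
    """Return a copy of `tree` with encapsulated terminals replaced by the
    subtrees they refer to."""
    if not tree.children and tree.label in library:
        return expand(library[tree.label], library)
    return Node(tree.label, [expand(c, library) for c in tree.children])


def demo_tree() -> Node:
    """A small hand-made decision tree used only for demonstration."""
    return Node("outlook", [
        Node("humidity", [Node("no"), Node("yes")]),
        Node("yes"),
        Node("windy", [Node("no"), Node("yes")]),
    ])
```

Under these assumptions, another individual in the population can reuse a stored module simply by containing a leaf such as `Node("ENC0")`; calling `expand` on that individual substitutes the stored subtree before classification.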