A preliminary study on the reuse of subtrees within decision trees in a genetic programming context for data classification

Bibliographic Details
Published in: 2013 Third World Congress on Information and Communication Technologies (WICT 2013), pp. 285-290
Main Authors: Dufourq, Emmanuel; Pillay, Nelishia
Format: Conference Proceeding
Language: English
Published: IEEE 01-12-2013
Abstract Genetic programming (GP) has been successful in creating data classification models that achieve high accuracy. In programming, writing functions is common practice because it isolates a piece of code so that it can be reused. The encapsulation genetic operator promotes modularization in a similar way: it encapsulates subtrees, which GP trees can then reuse during the execution of the algorithm. Models created for data classification problems tend to be large and complex, which creates a need for modular acquisition methods that promote the reuse of existing subtrees when solving these problems. The effect of the encapsulation operator on GP for data classification problems has not previously been investigated. Two approaches were proposed: the first incorporated the encapsulation operator with no limitations on how the encapsulated subtrees were used, and the second made use of a maintained list of encapsulated subtrees. The two proposed methods were tested on eight data sets, and the results show that the encapsulation operator improved the training accuracy on nearly every data set.
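As an illustration of the encapsulation idea summarized in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes a tree of labelled nodes and a maintained dictionary of encapsulated subtrees, loosely in the spirit of the second approach; the names `Node`, `encapsulate`, `expand`, and the `ENC<n>` terminal labels are hypothetical and do not come from the paper.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """A tree node: an attribute test (with children) or a class label (leaf)."""
    label: str
    children: list = field(default_factory=list)


def subtree_paths(node, path=()):
    """Yield the path (tuple of child indices) to every node in the tree."""
    yield path
    for i, child in enumerate(node.children):
        yield from subtree_paths(child, path + (i,))


def get_at(node, path):
    """Follow a path of child indices down from node."""
    for i in path:
        node = node.children[i]
    return node


def encapsulate(tree, library, rng):
    """Move one random non-root, non-leaf subtree into the library.

    The subtree is replaced in place by a new terminal (assumed naming
    scheme: ENC0, ENC1, ...) that other trees could also reference.
    Returns the terminal name, or None if no subtree qualifies.
    """
    candidates = [p for p in subtree_paths(tree)
                  if p and get_at(tree, p).children]
    if not candidates:
        return None
    path = rng.choice(candidates)
    parent = get_at(tree, path[:-1])
    name = f"ENC{len(library)}"
    library[name] = parent.children[path[-1]]
    parent.children[path[-1]] = Node(name)
    return name


def expand(node, library):
    """Return a copy of the tree with encapsulated terminals expanded."""
    if not node.children and node.label in library:
        return expand(library[node.label], library)
    return Node(node.label, [expand(c, library) for c in node.children])


# Usage: encapsulate a subtree, then recover the full tree on demand.
tree = Node("outlook", [Node("humidity", [Node("no"), Node("yes")]),
                        Node("yes")])
original = expand(tree, {})          # deep copy taken before encapsulation
library = {}
name = encapsulate(tree, library, random.Random(0))
```

In this sketch the library is shared state, so any tree holding an `ENC<n>` terminal reuses the same stored subtree; how the paper selects, limits, or maintains its list of subtrees is not specified in this record.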
Author Dufourq, Emmanuel
Pillay, Nelishia
Author_xml – sequence: 1
  givenname: Emmanuel
  surname: Dufourq
  fullname: Dufourq, Emmanuel
  email: edufourq@gmail.com
  organization: Sch. of Math., Stat. & Comput. Sci., Univ. of KwaZulu-Natal, Natal, South Africa
– sequence: 2
  givenname: Nelishia
  surname: Pillay
  fullname: Pillay, Nelishia
  email: pillayn32@ukzn.ac.za
  organization: Sch. of Math., Stat. & Comput. Sci., Univ. of KwaZulu-Natal, Natal, South Africa
ContentType Conference Proceeding
DBID 6IE
6IL
CBEJK
RIE
RIL
DOI 10.1109/WICT.2013.7113150
DatabaseName IEEE Electronic Library (IEL) Conference Proceedings
IEEE Proceedings Order Plan All Online (POP All Online) 1998-present by volume
IEEE Xplore All Conference Proceedings
IEEE Electronic Library Online
IEEE Proceedings Order Plans (POP All) 1998-Present
Database_xml – sequence: 1
  dbid: RIE
  name: IEEE Electronic Library Online
  url: http://ieeexplore.ieee.org/Xplore/DynWel.jsp
  sourceTypes: Publisher
DeliveryMethod fulltext_linktorsrc
EISBN 9781479932306
1479932302
EndPage 290
ExternalDocumentID 7113150
Genre orig-research
IEDL.DBID RIE
IngestDate Thu Jun 29 18:38:14 EDT 2023
IsPeerReviewed false
IsScholarly false
Language English
LinkModel DirectLink
PageCount 6
ParticipantIDs ieee_primary_7113150
PublicationCentury 2000
PublicationDate 2013-Dec.
PublicationDateYYYYMMDD 2013-12-01
PublicationDate_xml – month: 12
  year: 2013
  text: 2013-Dec.
PublicationDecade 2010
PublicationTitle 2013 Third World Congress on Information and Communication Technologies (WICT 2013)
PublicationTitleAbbrev WICT
PublicationYear 2013
Publisher IEEE
Publisher_xml – name: IEEE
SourceID ieee
SourceType Publisher
StartPage 285
SubjectTerms data classification
data mining
Encapsulation
genetic programming
Genetics
Glass
Iris
Meteorology
optimization
Sonar
Title A preliminary study on the reuse of subtrees within decision trees in a genetic programming context for data classification
URI https://ieeexplore.ieee.org/document/7113150