Optimal Boolean lattice-based algorithms for the U-curve optimization problem

Bibliographic Details
Published in: Information sciences Vol. 471; pp. 97-114
Main Authors: Reis, Marcelo S., Estrela, Gustavo, Ferreira, Carlos Eduardo, Barrera, Junior
Format: Journal Article
Language: English
Published: Elsevier Inc 01-01-2019
Description
Summary:
• The U-curve problem can be used to model feature selection.
• We pointed out an error in the first algorithm proposed to solve the U-curve problem.
• We introduced UCS, an optimal algorithm for the U-curve problem.
• We also provide UCSR, an algorithm for a special case of the U-curve problem.
• UCS and UCSR outperformed BFS, CHCGA and SFFS in feature selection experiments.

The U-curve optimization problem is characterized by a cost function that is decomposable in U-shaped curves over the chains of a Boolean lattice. This problem can be applied to model the classical feature selection problem in Machine Learning. In this paper, we point out that the first algorithm proposed to tackle the U-curve problem, the RBM algorithm, is in fact suboptimal. We also present two new algorithms: UCS, which is optimal for this problem; and UCSR, a variation of UCS that solves a special case of the U-curve problem and relies on a reduced ordered binary decision diagram to control the search space. We provide results of two computational assays with these new algorithms: first, W-operator design for filtering of binary images; second, linear SVM design for classification of data sets from the UCI Machine Learning Repository. We show that, in these assays, UCS and UCSR outperformed an exhaustive search and also three widely used heuristics: the SFFS sequential selection, the BFS graph-based search, and the CHCGA genetic algorithm. Finally, we analyze the obtained results and point out improvements that might enhance the performance of these two novel algorithms.
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2018.08.060
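
The summary describes a cost function that is decomposable in U-shaped curves over the chains of a Boolean lattice. As a rough illustration only, the Python sketch below assumes the usual formulation of that property, namely c(X) <= max(c(A), c(B)) whenever A ⊆ X ⊆ B, and checks it by brute force on a tiny feature set. The feature names, both toy cost functions, and all helper names are hypothetical and do not come from the paper. The search shown is the exhaustive baseline mentioned in the summary, not UCS or UCSR.

```python
from itertools import combinations

FEATURES = ("f1", "f2", "f3", "f4")  # hypothetical feature names


def powerset(features):
    """Yield every subset of `features` (the Boolean lattice) as a frozenset."""
    items = list(features)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def is_u_shaped(cost, features):
    """Brute-force check of the assumed U-curve condition:
    c(X) <= max(c(A), c(B)) for every A <= X <= B in the lattice."""
    subsets = list(powerset(features))
    for a in subsets:
        for b in subsets:
            if not a <= b:
                continue
            bound = max(cost(a), cost(b))
            for x in subsets:
                if a <= x <= b and cost(x) > bound:
                    return False
    return True


def exhaustive_minimum(cost, features):
    """Exhaustive-search baseline: evaluate all 2^n subsets, keep a cheapest one."""
    return min(powerset(features), key=cost)


def size_cost(subset):
    """Toy cost that depends only on subset size and is lowest at size 2.
    Sizes grow along any chain, so the cost falls and then rises (U-shaped)."""
    return (len(subset) - 2) ** 2


USEFUL = frozenset({"f1", "f2"})  # hypothetical "relevant" features


def additive_cost(subset):
    """Toy cost that is NOT U-shaped: along the chain {} -> {f3} -> {f1, f2, f3}
    it goes 4 -> 5 -> 1, so the middle exceeds both ends."""
    return 2 * len(USEFUL - subset) + len(subset - USEFUL)


if __name__ == "__main__":
    print(is_u_shaped(size_cost, FEATURES))
    print(is_u_shaped(additive_cost, FEATURES))
    print(sorted(exhaustive_minimum(size_cost, FEATURES)))
```

The check enumerates every nested pair of subsets, so it is only practical for a handful of features; it is meant to make the definition concrete, not to suggest how the algorithms in the paper explore or prune the lattice.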