A systematic review of multi-label feature selection and a new method based on label construction

Bibliographic Details
Published in: Neurocomputing (Amsterdam), Vol. 180, pp. 3-15
Main Authors: Spolaôr, Newton; Monard, Maria Carolina; Tsoumakas, Grigorios; Lee, Huei Diana
Format: Journal Article
Language: English
Published: Elsevier B.V., 05-03-2016
Description
Summary: Each example in a multi-label dataset is associated with multiple labels, which are often correlated. Learning from this data can be improved when dimensionality reduction tasks, such as feature selection, are applied. The standard approach for multi-label feature selection transforms the multi-label dataset into single-label datasets before using traditional feature selection algorithms. However, this approach often ignores label dependence. In this work, we propose an alternative method, LCFS, that constructs new labels based on relations between the original labels. By doing so, the label set from the data is augmented with second-order information before applying the standard approach. To assess LCFS, an experimental evaluation using Information Gain as a measure to estimate the importance of features was carried out on 10 benchmark multi-label datasets. This evaluation compared four LCFS settings with the standard approach, using random feature selection as a reference. For each dataset, the performance of a feature selection method is estimated by the quality of the classifiers built from the data described by the features selected by the method. The results show that a simple LCFS setting gave rise to classifiers similar to, or better than, the ones built using the standard approach. Furthermore, this work also pioneers the use of the systematic review method to survey the related work on multi-label feature selection. The summary of the 99 papers found promotes the idea that exploring label dependence during feature selection can lead to good results.
• By constructing new labels, LCFS considers label relations from a multi-label dataset.
• An LCFS setting achieved performance competitive with the standard approach.
• LCFS helped to outperform classifiers based on the experimental references.
• We also pioneer the use of the systematic review method on the multi-label feature selection literature.
• The summary of the 99 papers found evidence that agrees with the LCFS achievements.
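The label construction idea in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the record does not say which relation LCFS uses to build new labels, how label pairs are chosen, or how per-label scores are aggregated. The sketch assumes a logical AND over every pair of original labels, discrete features, and the mean Information Gain across labels as the feature score. All function names (`entropy`, `information_gain`, `construct_labels`, `rank_features`) are hypothetical.

```python
import math
from itertools import combinations


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def information_gain(feature, label):
    """IG = H(label) - H(label | feature) for two discrete columns."""
    n = len(label)
    groups = {}
    for f, y in zip(feature, label):
        groups.setdefault(f, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(label) - conditional


def construct_labels(Y, combine=lambda a, b: a & b):
    """Append one constructed label per pair of original labels.

    Assumption: labels are 0/1 and the pairwise relation is a logical AND;
    the source only says new labels are built from label relations.
    """
    pairs = list(combinations(range(len(Y[0])), 2))
    return [row + [combine(row[i], row[j]) for i, j in pairs] for row in Y]


def rank_features(X, Y):
    """Treat each label column as a single-label problem, score every
    feature by its mean IG over those columns (assumed aggregation),
    and return (feature indices sorted best first, scores)."""
    n_features, n_labels = len(X[0]), len(Y[0])
    scores = []
    for f in range(n_features):
        column = [row[f] for row in X]
        gains = [
            information_gain(column, [row[l] for row in Y])
            for l in range(n_labels)
        ]
        scores.append(sum(gains) / n_labels)
    ranking = sorted(range(n_features), key=lambda f: -scores[f])
    return ranking, scores


# Toy data: 4 examples, 3 discrete features, 2 original labels.
X = [[0, 0, 1], [0, 1, 1], [1, 0, 0], [1, 1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]

Y_augmented = construct_labels(Y)               # original labels + pairwise labels
ranking, scores = rank_features(X, Y_augmented)  # standard approach on augmented labels
print(ranking, scores)
```

Calling `rank_features(X, Y)` on the original labels gives the baseline ranking, so the two rankings can be compared to see how the constructed labels shift the feature scores.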
ISSN: 0925-2312, 1872-8286
DOI: 10.1016/j.neucom.2015.07.118