Feature selection using particle swarm optimization-based logistic regression model

Bibliographic Details
Published in: Chemometrics and intelligent laboratory systems Vol. 182; pp. 41-46
Main Authors: Qasim, Omar Saber; Algamal, Zakariya Yahya
Format: Journal Article
Language: English
Published: Elsevier B.V. 15-11-2018
Description
Summary: In any classification problem, the dataset typically has a large number of features. However, not all features are necessary for good classification performance, because some of them are irrelevant or redundant. Therefore, classifiers that use fewer features yet achieve better classification accuracy are favored for ease of interpretation. In this work, a particle swarm optimization (PSO) algorithm combined with a logistic regression model is proposed. Additionally, the Bayesian information criterion (BIC) is proposed as the fitness function. The performance of different fitness functions is investigated and compared with BIC. The proposed method is evaluated on a large number of datasets of different types. The experimental results demonstrate the usefulness of the proposed method in obtaining significantly improved classification performance with few features. Further, the results show that the proposed method performs competitively compared with other existing fitness functions.
• We examined the performance of the proposed method, PSO-LRBIC, for descriptor selection in QSAR classification.
• The PSO-LRBIC method has better performance than existing fitness functions.
• The classification ability of the PSO-LRBIC method is quite high.
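The idea in the summary, a swarm searching over feature subsets with each subset scored by the BIC of a logistic regression fit, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the sigmoid-transfer binary PSO update, the swarm parameters (`w`, `c1`, `c2`), the Newton-based fit with a tiny ridge term, the BIC form -2 log L + k ln n, and the synthetic data are all assumptions made here, and every function name is hypothetical.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, ridge=1e-6):
    """Fit logistic regression by Newton's method; return the log-likelihood.
    The tiny ridge term is an assumption added only for numerical stability."""
    Xb = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
        grad = Xb.T @ (y - p) - ridge * w
        hess = (Xb * (p * (1 - p))[:, None]).T @ Xb + ridge * np.eye(len(w))
        w += np.linalg.solve(hess, grad)
    p = 1.0 / (1.0 + np.exp(-np.clip(Xb @ w, -30, 30)))
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

def bic_fitness(mask, X, y):
    """BIC of the logistic model restricted to the selected features (lower is better)."""
    if not mask.any():
        return np.inf                                # empty subsets are not allowed
    k = mask.sum() + 1                               # selected features plus intercept
    return -2.0 * fit_logistic(X[:, mask], y) + k * np.log(len(y))

def binary_pso(X, y, n_particles=12, n_iter=20, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature masks, using a sigmoid transfer function (assumed variant)."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    pos = (rng.random((n_particles, d)) < 0.5).astype(float)   # 0/1 positions
    vel = rng.uniform(-1, 1, (n_particles, d))
    pbest = pos.copy()
    pbest_fit = np.array([bic_fitness(m.astype(bool), X, y) for m in pos])
    g = pbest_fit.argmin()
    gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    for _ in range(n_iter):
        r1, r2 = rng.random((2, n_particles, d))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        vel = np.clip(vel, -4, 4)                    # keep selection probabilities away from 0/1
        pos = (rng.random((n_particles, d)) < 1.0 / (1.0 + np.exp(-vel))).astype(float)
        fit = np.array([bic_fitness(m.astype(bool), X, y) for m in pos])
        improved = fit < pbest_fit
        pbest[improved] = pos[improved]
        pbest_fit[improved] = fit[improved]
        g = pbest_fit.argmin()
        if pbest_fit[g] < gbest_fit:
            gbest, gbest_fit = pbest[g].copy(), pbest_fit[g]
    return gbest.astype(bool), gbest_fit

# Synthetic example (assumed data): only features 0 and 1 carry signal.
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 8))
logits = 2.0 * X[:, 0] - 1.5 * X[:, 1]
y = (data_rng.random(200) < 1.0 / (1.0 + np.exp(-logits))).astype(float)

mask, bic = binary_pso(X, y)
print("selected features:", np.flatnonzero(mask), "BIC:", bic)
```

Because the BIC penalty grows with the number of selected features, a fitness function of this kind tends to favor smaller subsets than one based on training accuracy alone, which is the trade-off the summary describes.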
ISSN: 0169-7439, 1873-3239
DOI: 10.1016/j.chemolab.2018.08.016