Tuning parameter estimation in SCAD-support vector machine using firefly algorithm with application in gene selection and cancer classification

Bibliographic Details
Published in: Computers in biology and medicine, Vol. 103, pp. 262-268
Main Authors: Al-Thanoon, Niam Abdulmunim; Qasim, Omar Saber; Algamal, Zakariya Yahya
Format: Journal Article
Language: English
Published: United States: Elsevier Ltd, 01-12-2018
Description
Summary: In cancer classification, gene selection is one of the most important bioinformatics-related topics. Gene selection can be treated as a variable selection problem that aims to find a small subset of genes carrying the most discriminative information for the classification target. The penalized support vector machine (PSVM) has proved effective at building a strong classifier that combines the advantages of the support vector machine and penalization. PSVM with a smoothly clipped absolute deviation (SCAD) penalty is the most widely used method. However, the efficiency of PSVM with SCAD depends on choosing an appropriate value for the tuning parameter of the SCAD penalty. In this paper, a firefly algorithm, which is a continuous metaheuristic algorithm, is proposed to determine the tuning parameter in PSVM with the SCAD penalty. The proposed algorithm helps to find the most relevant genes efficiently while maintaining high classification performance. Experimental results on four benchmark gene expression datasets show that the proposed algorithm outperforms competing methods in terms of classification accuracy and the number of selected genes.
• The proposed method has better performance than cross-validation (CV).
• The classification ability of the proposed method is quite high.
• The proposed method performed remarkably well in the gene selection stability test.
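The idea in the summary can be illustrated with a minimal Python sketch. It is not the authors' implementation: the SCAD penalty follows the standard definition of Fan and Li (with the conventional default a = 3.7), the firefly search is a generic one-dimensional variant, and the names `scad_penalty`, `firefly_minimize`, and `toy_objective` are hypothetical. In practice the objective would presumably be a validation error of the SCAD-penalized SVM as a function of the tuning parameter; a toy function stands in for it here.

```python
import numpy as np


def scad_penalty(beta, lam, a=3.7):
    """Standard SCAD penalty, applied elementwise to coefficients beta."""
    b = np.abs(np.asarray(beta, dtype=float))
    linear = lam * b                                            # |beta| <= lam
    quadratic = (2 * a * lam * b - b**2 - lam**2) / (2 * (a - 1))
    constant = lam**2 * (a + 1) / 2                             # |beta| > a*lam
    return np.where(b <= lam, linear,
                    np.where(b <= a * lam, quadratic, constant))


def firefly_minimize(objective, lower, upper, n_fireflies=15, n_iter=50,
                     beta0=1.0, gamma=1.0, alpha=0.2, seed=0):
    """Generic firefly search for a scalar parameter in [lower, upper].

    Lower objective values are treated as brighter fireflies.
    Returns the best position and objective value seen during the search.
    """
    rng = np.random.default_rng(seed)
    scale = upper - lower
    x = rng.uniform(lower, upper, size=n_fireflies)
    f = np.array([objective(v) for v in x])
    best = int(np.argmin(f))
    best_x, best_f = float(x[best]), float(f[best])

    for _ in range(n_iter):
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if f[j] < f[i]:
                    # Assumption: distance is normalized by the search range
                    # so that gamma has the same meaning on any interval.
                    r = abs(x[i] - x[j]) / scale
                    attraction = beta0 * np.exp(-gamma * r**2)
                    noise = alpha * scale * (rng.random() - 0.5)
                    x[i] = np.clip(x[i] + attraction * (x[j] - x[i]) + noise,
                                   lower, upper)
                    f[i] = objective(x[i])
                    if f[i] < best_f:
                        best_x, best_f = float(x[i]), float(f[i])
        alpha *= 0.97  # shrink the random step as the search proceeds
    return best_x, best_f


def toy_objective(lam):
    """Hypothetical stand-in for a validation error as a function of lam."""
    return (lam - 0.3) ** 2


if __name__ == "__main__":
    lam_hat, err = firefly_minimize(toy_objective, lower=0.0, upper=1.0)
    print("selected tuning parameter:", lam_hat, "objective:", err)
    print("SCAD penalty at that value:", scad_penalty([0.1, 0.5, 5.0], lam_hat))
```

Replacing `toy_objective` with a function that fits a SCAD-penalized SVM and returns its validation error would turn the sketch into a tuning-parameter search of the kind the summary describes; the fitness function actually used in the paper is not stated in this record.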
ISSN:0010-4825
1879-0534
DOI:10.1016/j.compbiomed.2018.10.034