Accelerating projections to kernel-induced spaces by feature approximation
Published in: | Pattern recognition letters Vol. 136; pp. 31 - 39 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: | Amsterdam; Elsevier B.V; 01-08-2020; Elsevier Science Ltd |
Subjects: | |
Summary: | • A method for simplifying data projections onto kernel spaces is proposed.
• Projection errors in simplified feature spaces are low for a set of standard datasets.
• A speed-up of two orders of magnitude in the data projection procedure can be obtained.
• The classification accuracy drop for simplified features is statistically insignificant.
This paper presents a method for speeding up data projections onto kernel-induced feature spaces (derived using, e.g., kernel Principal Component Analysis, kPCA). The idea is to approximate the derived features, which are implicitly defined by all training samples and by the dominant eigenvectors of problem-specific generalized eigenproblems. Instead of employing the whole training set, we propose to use a small pool of appropriately selected representatives, and we formulate a rule for deriving the corresponding weight vectors that replace the dominant eigenvectors. The representatives are determined by clustering the training data, whereas the weighting coefficients are chosen to minimize the approximation errors of the original features. The concept has been experimentally verified for kernel PCA using both artificial and real datasets. The presented approach reduces feature-extraction complexity, and hence proportionally increases data projection speed, by one to two orders of magnitude without sacrificing data analysis accuracy. The approach is therefore well suited to kernel-based, intelligent data analysis applications executed on resource-limited systems, such as embedded or IoT devices, or on systems where processing time is critical. |
---|---|
ISSN: | 0167-8655 1872-7344 |
DOI: | 10.1016/j.patrec.2020.05.029 |
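
The idea outlined in the summary can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and several choices are assumptions made only for illustration: a Gaussian (RBF) kernel with a hypothetical width `gamma`, uncentered kernel PCA, a plain k-means loop for picking the representatives, and an ordinary least-squares fit for the replacement weights, chosen so that the approximate projections match the exact ones on the training samples. The paper's actual weight-derivation rule may differ. All function names are hypothetical.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Gaussian kernel matrix between the rows of A and the rows of B.
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def kpca_coefficients(K, n_components):
    # Dominant eigenvectors of the (uncentered, by assumption) kernel matrix,
    # scaled by 1/sqrt(eigenvalue) as in standard kernel PCA.
    vals, vecs = np.linalg.eigh(K)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] / np.sqrt(vals[idx])

def kmeans(X, m, n_iter=50, seed=0):
    # Plain Lloyd iterations; the centroids act as the representatives.
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=m, replace=False)].copy()
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(m):
            members = X[labels == j]
            if len(members):
                centers[j] = members.mean(0)
    return centers

def fit_approximation(X, gamma, n_components, m):
    K = rbf_kernel(X, X, gamma)
    alpha = kpca_coefficients(K, n_components)  # exact coefficients, shape (n, c)
    Z = kmeans(X, m)                            # m representatives, m << n
    K_xz = rbf_kernel(X, Z, gamma)
    # Assumed rule: least-squares weights that reproduce the exact training
    # projections K @ alpha using only the m representatives.
    beta, *_ = np.linalg.lstsq(K_xz, K @ alpha, rcond=None)
    return alpha, Z, beta

# Synthetic data: three Gaussian blobs in 2-D (illustrative only).
rng = np.random.default_rng(0)
blob_means = (-2.0, 0.0, 2.0)
X = np.vstack([rng.normal(c, 0.3, size=(150, 2)) for c in blob_means])
X_new = np.vstack([rng.normal(c, 0.3, size=(20, 2)) for c in blob_means])

gamma = 0.5
alpha, Z, beta = fit_approximation(X, gamma, n_components=2, m=15)

exact = rbf_kernel(X_new, X, gamma) @ alpha    # 450 kernel evaluations per sample
approx = rbf_kernel(X_new, Z, gamma) @ beta    # 15 kernel evaluations per sample
rel_err = np.linalg.norm(exact - approx) / np.linalg.norm(exact)
print("relative projection error:", rel_err)
```

In this sketch, projecting a new sample costs one kernel evaluation per representative rather than one per training sample, so the per-sample cost falls by roughly the ratio n/m (here 450/15). This is the kind of proportional speed-up the summary describes.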