An empirical analysis of different sparse penalties for autoencoder in unsupervised feature learning
Published in: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors:
Format: Conference Proceeding; Journal Article
Language: English
Published: IEEE, 01-07-2015
Summary: Machine learning algorithms depend heavily on data representation, which largely determines their success in terms of accuracy. The autoencoder model is designed to learn a good representation of the data with the least possible distortion. Furthermore, it has been shown that encouraging sparsity in the learned representation can significantly improve performance on classification tasks and make the feature vectors easier to interpret. A straightforward way for an autoencoder to obtain a sparse representation is to impose a sparsity penalty on its overall cost function. Nevertheless, little comparative analysis has been conducted to evaluate which sparsity penalty term works better. In this paper, we adopt the L1 norm, L2 norm, and Student-t penalties, which are rarely used to penalise the hidden-unit outputs, alongside the KL-divergence penalty commonly used in the literature. We then present a detailed analysis of which penalty achieves better results in terms of reconstruction error, sparseness of the representation, and classification performance on test datasets. An experimental study on the MNIST, CIFAR-10, SVHN, OPTDIGITS and NORB datasets reveals that all of these penalties yield sparse representations and outperform those learned by a plain autoencoder in both classification performance and sparseness of the feature vectors. We hope this topic and these practices will provide insights for future research.
ISSN: 2161-4393, 2161-4407
DOI: 10.1109/IJCNN.2015.7280568
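
The summary above describes imposing a sparsity penalty on the autoencoder's overall cost function. As a minimal illustrative sketch (not the authors' code), the snippet below shows how the four penalties compared in the paper could be computed over a batch of hidden-unit activations; the names (`h_batch`, `rho`, `beta`), the squared-error reconstruction term, and the sigmoid-range assumption behind the KL penalty are assumptions on our part.

```python
import numpy as np

def l1_penalty(h_batch):
    # L1 norm of the hidden activations, sum_j |h_j|, averaged over the batch.
    return np.mean(np.sum(np.abs(h_batch), axis=1))

def l2_penalty(h_batch):
    # Squared L2 norm of the hidden activations, sum_j h_j^2.
    return np.mean(np.sum(h_batch ** 2, axis=1))

def student_t_penalty(h_batch):
    # Student-t style penalty, sum_j log(1 + h_j^2).
    return np.mean(np.sum(np.log1p(h_batch ** 2), axis=1))

def kl_penalty(h_batch, rho=0.05, eps=1e-8):
    # KL divergence between a target activation level rho and the mean
    # activation rho_hat of each hidden unit over the batch; this is the
    # usual formulation for sparse autoencoders with sigmoid hidden units.
    rho_hat = np.clip(h_batch.mean(axis=0), eps, 1.0 - eps)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def total_cost(x, x_rec, h_batch, penalty, beta=0.1):
    # Overall cost: mean squared reconstruction error plus a weighted
    # sparsity penalty on the hidden representation.
    return np.mean(np.sum((x - x_rec) ** 2, axis=1)) + beta * penalty(h_batch)

# Illustrative usage with random stand-ins for encoder/decoder outputs.
rng = np.random.default_rng(0)
x = rng.random((32, 784))       # e.g. a batch of flattened MNIST digits
h = rng.random((32, 256))       # hidden activations in (0, 1)
x_rec = rng.random((32, 784))   # reconstructions from the decoder
for p in (l1_penalty, l2_penalty, student_t_penalty, kl_penalty):
    print(p.__name__, total_cost(x, x_rec, h, p))
```

Note the structural difference the summary alludes to: the KL penalty constrains the mean activation of each hidden unit across the batch, whereas the L1, L2 and Student-t penalties act directly on individual hidden-unit outputs.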