An empirical analysis of different sparse penalties for autoencoder in unsupervised feature learning
Published in: 2015 International Joint Conference on Neural Networks (IJCNN), pp. 1-8
Main Authors:
Format: Conference Proceeding; Journal Article
Language: English
Published: IEEE, 01-07-2015
Summary: Machine learning algorithms depend heavily on data representation, which largely determines their success in terms of accuracy. The autoencoder model is designed to learn a good representation of the data with the least possible distortion. Furthermore, it has been shown that encouraging sparsity in the learned representation can significantly improve performance on classification tasks and make the feature vectors easier to interpret. A straightforward way for an autoencoder to obtain a sparse representation is to impose a sparsity penalty on its overall cost function. Nevertheless, little comparative analysis has been conducted to evaluate which sparsity penalty term works better. In this paper, we adopt the L1 norm, L2 norm, and Student-t penalties, which are rarely used to penalise the hidden-unit outputs, alongside the KL-divergence penalty commonly used in the literature. We then present a detailed analysis of which penalty achieves better results in terms of reconstruction error, sparseness of the representation, and classification performance on test datasets. An experimental study on the MNIST, CIFAR-10, SVHN, OPTDIGITS and NORB datasets reveals that all of these penalties yield sparse representations and outperform those learned by a plain autoencoder in both classification performance and sparseness of the feature vectors. We hope this topic and these practices will provide insights for future research.
ISSN: 2161-4393, 2161-4407
DOI: 10.1109/IJCNN.2015.7280568
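
The summary above describes imposing a sparsity penalty on the autoencoder's overall cost function. As a minimal illustrative sketch (not the authors' code), the snippet below shows how the four penalties compared in the paper could be computed over a batch of hidden-unit activations; the names (`h_batch`, `rho`, `beta`), the squared-error reconstruction term, and the sigmoid-range assumption behind the KL penalty are assumptions on our part.

```python
import numpy as np

def l1_penalty(h_batch):
    # L1 norm of the hidden activations, sum_j |h_j|, averaged over the batch.
    return np.mean(np.sum(np.abs(h_batch), axis=1))

def l2_penalty(h_batch):
    # Squared L2 norm of the hidden activations, sum_j h_j^2.
    return np.mean(np.sum(h_batch ** 2, axis=1))

def student_t_penalty(h_batch):
    # Student-t style penalty, sum_j log(1 + h_j^2).
    return np.mean(np.sum(np.log1p(h_batch ** 2), axis=1))

def kl_penalty(h_batch, rho=0.05, eps=1e-8):
    # KL divergence between a target activation level rho and the mean
    # activation rho_hat of each hidden unit over the batch; this is the
    # usual formulation for sparse autoencoders with sigmoid hidden units.
    rho_hat = np.clip(h_batch.mean(axis=0), eps, 1.0 - eps)
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1.0 - rho) * np.log((1.0 - rho) / (1.0 - rho_hat)))

def total_cost(x, x_rec, h_batch, penalty, beta=0.1):
    # Overall cost: mean squared reconstruction error plus a weighted
    # sparsity penalty on the hidden representation.
    return np.mean(np.sum((x - x_rec) ** 2, axis=1)) + beta * penalty(h_batch)

# Illustrative usage with random stand-ins for encoder/decoder outputs.
rng = np.random.default_rng(0)
x = rng.random((32, 784))       # e.g. a batch of flattened MNIST digits
h = rng.random((32, 256))       # hidden activations in (0, 1)
x_rec = rng.random((32, 784))   # reconstructions from the decoder
for p in (l1_penalty, l2_penalty, student_t_penalty, kl_penalty):
    print(p.__name__, total_cost(x, x_rec, h, p))
```

Note the structural difference the summary alludes to: the KL penalty constrains the mean activation of each hidden unit across the batch, whereas the L1, L2 and Student-t penalties act directly on individual hidden-unit outputs.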