Learning hierarchical representations for face verification with convolutional deep belief networks

Bibliographic Details
Published in:	2012 IEEE Conference on Computer Vision and Pattern Recognition, pp. 2518-2525
Main Authors:	Huang, G. B.; Lee, Honglak; Learned-Miller, E.
Format:	Conference Proceeding
Language:	English
Published:	IEEE 01-06-2012
Subjects:	Accuracy; Convolutional codes; Face; Face recognition; Measurement; Training; Vectors

Description
Summary:	Most modern face recognition systems rely on a feature representation given by a hand-crafted image descriptor, such as Local Binary Patterns (LBP), and achieve improved performance by combining several such representations. In this paper, we propose deep learning as a natural source for obtaining additional, complementary representations. To learn features in high-resolution images, we make use of convolutional deep belief networks. Moreover, to take advantage of global structure in an object class, we develop local convolutional restricted Boltzmann machines, a novel convolutional learning model that exploits the global structure by not assuming stationarity of features across the image, while maintaining scalability and robustness to small misalignments. We also present a novel application of deep learning to descriptors other than pixel intensity values, such as LBP. In addition, we compare the performance of networks trained using unsupervised learning against networks with random filters, and empirically show that learning weights is not only necessary for obtaining good multilayer representations, but also provides robustness to the choice of the network architecture parameters. Finally, we show that a recognition system using only representations obtained from deep learning can achieve accuracy comparable to that of a system using a combination of hand-crafted image descriptors. Moreover, by combining these representations, we achieve state-of-the-art results on a real-world face verification database.
ISBN:	9781467312264 1467312266
ISSN:	1063-6919
DOI:	10.1109/CVPR.2012.6247968