The Role of the Information Bottleneck in Representation Learning

Bibliographic Details
Published in: 2018 IEEE International Symposium on Information Theory (ISIT), pp. 1580-1584
Main Authors: Vera, Matias; Piantanida, Pablo; Vega, Leonardo Rey
Format: Conference Proceeding
Language: English
Published: IEEE, 01-06-2018
Summary: A grand challenge in representation learning is the development of computational algorithms that learn the different explanatory factors of variation behind high-dimensional data. Encoder models are typically trained to optimize performance on the training data, while the real objective is to generalize well to other (unseen) data. Although numerical evidence suggests that noise injection at the level of representations may improve the generalization ability of the resulting encoders, an information-theoretic justification of this principle remains elusive. In this work, we derive an upper bound on the so-called generalization gap corresponding to the cross-entropy loss and show that, when this bound scaled by a suitable multiplier and the empirical risk are minimized jointly, the problem is equivalent to optimizing the Information Bottleneck objective with respect to the empirical data distribution. We specialize our general conclusions to analyze the dropout regularization method in deep neural networks, explaining how this regularizer helps to decrease the generalization gap.
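For context, the Information Bottleneck objective referenced in the summary is, in its standard Lagrangian form (Tishby et al., 1999), a trade-off between compressing the representation U of the input X and preserving information about the label Y through a stochastic encoder; the multiplier beta below plays the role of the "suitable multiplier" mentioned in the abstract, though the paper's exact formulation may differ. A minimal LaTeX statement:

    % Information Bottleneck Lagrangian for a stochastic encoder p(u|x):
    % compress U with respect to X while retaining information about Y;
    % \beta trades off the two mutual-information terms.
    \min_{p(u \mid x)} \; I(U; X) \;-\; \beta \, I(U; Y)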
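The dropout regularizer analyzed in the paper is a concrete instance of noise injection at the level of representations. A minimal sketch of standard inverted dropout in Python (NumPy only; the function name and toy values are illustrative, not taken from the paper):

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout(u, p=0.5, train=True):
        # Inverted dropout: during training, each unit of the
        # representation u is kept with probability 1 - p and rescaled
        # by 1 / (1 - p) so its expected value is unchanged; at test
        # time the representation passes through untouched.
        if not train or p == 0.0:
            return u
        mask = rng.random(u.shape) >= p
        return u * mask / (1.0 - p)

    # Toy usage: inject multiplicative noise into a hidden representation.
    u = np.array([0.8, -1.2, 0.3, 2.0])
    print(dropout(u, p=0.5))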
ISSN: 2157-8117
DOI: 10.1109/ISIT.2018.8437679