Contrasting Contrastive Self-Supervised Representation Learning Pipelines

Bibliographic Details
Published in: 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pp. 9929-9939
Main Authors: Kotar, Klemen; Ilharco, Gabriel; Schmidt, Ludwig; Ehsani, Kiana; Mottaghi, Roozbeh
Format: Conference Proceeding
Language: English
Published: IEEE 01-10-2021
Description
Summary: In the past few years, we have witnessed remarkable breakthroughs in self-supervised representation learning. Despite the success and adoption of representations learned through this paradigm, much is yet to be understood about how different training methods and datasets influence performance on downstream tasks. In this paper, we analyze contrastive approaches as one of the most successful and popular variants of self-supervised representation learning. We perform this analysis from the perspective of the training algorithms, pre-training datasets and end tasks. We examine over 700 training experiments including 30 encoders, 4 pre-training datasets and 20 diverse downstream tasks. Our experiments address various questions regarding the performance of self-supervised models compared to their supervised counterparts, current benchmarks used for evaluation, and the effect of the pre-training data on end task performance. Our Visual Representation Benchmark (ViRB) is available at: https://github.com/allenai/virb.
ISSN: 2380-7504
DOI: 10.1109/ICCV48922.2021.00980