Information maximization clustering via multi-view self-labelling


Bibliographic Details
Published in: Knowledge-Based Systems, Vol. 250, p. 109042
Main Authors: Ntelemis, Foivos; Jin, Yaochu; Thomas, Spencer A.
Format: Journal Article
Language: English
Published: Elsevier B.V., 17-08-2022
Description
Summary: Image clustering is a particularly challenging computer vision task that aims to generate annotations without human supervision. Recent advances apply self-supervised learning strategies to image clustering, first learning valuable semantics and then clustering the image representations. These multi-phase algorithms, however, involve several hyper-parameters and transformation functions, and are computationally intensive. By extending the grouping-based self-supervised approach, this work proposes a novel single-phase clustering method that simultaneously learns meaningful representations and assigns the corresponding annotations. This is achieved by integrating a discrete representation into the self-supervised paradigm through a classifier network. Specifically, the proposed clustering objective employs mutual information to maximise the dependency of the integrated discrete representation on a discrete probability distribution. The discrete probability distribution is derived by means of a self-supervised process that compares the learnt latent representation with a set of trainable prototypes. To enhance the learning performance of the classifier, we jointly apply the mutual information across multi-crop views. Our empirical results show that the proposed framework outperforms state-of-the-art techniques, with average clustering accuracies of 89.1%, 49.0%, 83.1%, and 27.9% on the baseline datasets CIFAR-10, CIFAR-100/20, STL10, and Tiny-ImageNet/200, respectively. Finally, the proposed method also demonstrates attractive robustness to parameter settings and to a large number of classes, making it readily applicable to other datasets.
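The summary describes deriving a discrete probability distribution by comparing a learnt latent representation with a set of trainable prototypes. A minimal sketch of one common way to realise such a comparison is a temperature-scaled softmax over cosine similarities between embeddings and prototype vectors; the function name, temperature value, and NumPy formulation below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np

def prototype_assignments(z, prototypes, temperature=0.1):
    """Soft cluster assignments from cosine similarity between
    embeddings z [batch, d] and trainable prototypes [k, d].

    Both are L2-normalized so the dot product is cosine similarity;
    a softmax with a low temperature sharpens the distribution.
    """
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    c = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    logits = z @ c.T / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)      # rows sum to 1
```

In a full pipeline these assignments would serve as the discrete target distribution that the classifier head is trained against, with the prototypes updated jointly by backpropagation.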
• Modifying and leveraging the mutual information to cluster the image data through an over-clustering distribution.
• Converting grouping-based self-supervised methods into multi-functional frameworks.
• The clustering is achieved jointly in a single-phase training process without increasing the training stages or hyper-parameters.
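The objective above maximises mutual information between two discrete assignment distributions (e.g. the classifier output and the prototype-derived distribution, possibly over different crops of the same image). A minimal sketch of estimating mutual information from a batch of paired soft assignments, in the style of joint-distribution MI estimators, is shown below; the function name and symmetrisation step are assumptions for illustration, not the paper's exact objective:

```python
import numpy as np

def mutual_information(p, q, eps=1e-12):
    """Estimate I(P; Q) from paired soft assignments p, q [batch, k].

    The joint distribution over cluster pairs is estimated by averaging
    the outer products over the batch; MI is then computed from the
    joint and its two marginals.
    """
    joint = p.T @ q / p.shape[0]            # [k, k] empirical joint
    joint = (joint + joint.T) / 2.0          # symmetrize (view order should not matter)
    pi = joint.sum(axis=1, keepdims=True)    # marginal over rows
    pj = joint.sum(axis=0, keepdims=True)    # marginal over columns
    return float((joint * (np.log(joint + eps)
                           - np.log(pi + eps)
                           - np.log(pj + eps))).sum())
```

Maximising this quantity rewards assignments that are both confident (low conditional entropy) and balanced across clusters (high marginal entropy), which is what lets a single-phase method avoid degenerate solutions where all images collapse into one cluster.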
ISSN: 0950-7051
eISSN: 1872-7409
DOI: 10.1016/j.knosys.2022.109042