Deep correspondence restricted Boltzmann machine for cross-modal retrieval
The task of cross-modal retrieval, i.e., using a text query to search for images or vice versa, has received considerable attention with the rapid growth of multi-modal web data. Modeling the correlations between different modalities is the key to tackle this problem. In this paper, we propose a cor...
Saved in:
Published in: | Neurocomputing (Amsterdam) Vol. 154; pp. 50 - 60 |
---|---|
Main Authors: | , , |
Format: | Journal Article |
Language: | English |
Published: |
Elsevier B.V
22-04-2015
|
Subjects: | |
Online Access: | Get full text |
Tags: |
Add Tag
No Tags, Be the first to tag this record!
|
Summary: | The task of cross-modal retrieval, i.e., using a text query to search for images or vice versa, has received considerable attention with the rapid growth of multi-modal web data. Modeling the correlations between different modalities is the key to tackle this problem. In this paper, we propose a correspondence restricted Boltzmann machine (Corr-RBM) to map the original features of bimodal data, such as image and text in our setting, into a low-dimensional common space, in which the heterogeneous data are comparable. In our Corr-RBM, two RBMs built for image and text, respectively are connected at their individual hidden representation layers by a correlation loss function. A single objective function is constructed to trade off the correlation loss and likelihoods of both modalities. Through the optimization of this objective function, our Corr-RBM is able to capture the correlations between two modalities and learn the representation of each modality simultaneously. Furthermore, we construct two deep neural structures using Corr-RBM as the main building block for the task of cross-modal retrieval. A number of comparison experiments are performed on three public real-world data sets. All of our models show significantly better results than state-of-the-art models in both searching images via text query and vice versa. |
---|---|
ISSN: | 0925-2312 1872-8286 |
DOI: | 10.1016/j.neucom.2014.12.020 |