Relation-aware aggregation network with auxiliary guidance for text-based person search
Published in: World Wide Web (Bussum), Vol. 25, no. 4, pp. 1565-1582
Main Authors:
Format: Journal Article
Language: English
Published: New York: Springer US, 01-07-2022 (Springer Nature B.V.)
Summary: In this paper, we propose a novel Relation-aware Aggregation Network with Auxiliary Guidance for text-based person search, namely RANAG. Existing works still struggle to capture the detailed appearance of a person and to compute the similarity between images and texts. RANAG is designed to address this problem from two aspects: relation-aware visual features and additional auxiliary signals. Specifically, we introduce a Relation-aware Aggregation Network (RAN) that exploits the relation between the person and local objects. Then, we propose three auxiliary tasks to acquire additional knowledge of semantic representations. Each task has its own objective: identifying the gender of the pedestrian in the image, distinguishing images of similar pedestrians, and aligning the semantic information between the description and the image. In addition, the data augmentation methods we explored can further improve performance. Extensive experiments demonstrate that our model outperforms state-of-the-art methods on the CUHK-PEDES dataset.
ISSN: 1386-145X, 1573-1413
DOI: 10.1007/s11280-021-00953-9