Receptive Multi-Granularity Representation for Person Re-Identification

Bibliographic Details
Published in:	IEEE transactions on image processing Vol. 29; pp. 6096 - 6109
Main Authors:	Wang, Guanshuo; Yuan, Yufeng; Li, Jiwei; Ge, Shiming; Zhou, Xi
Format:	Journal Article
Language:	English
Published:	United States IEEE 01-01-2020 The Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Subjects:	Adaptation models; Computer architecture; convolutional neural networks; Datasets; Feature extraction; Image enhancement; Learning; local feature learning; Machine learning; Misalignment; multiple granularity learning; Neurons; Partitions; Person re-identification; Representations; Robustness; Semantics; Task analysis

Description
Summary:	A key to person re-identification is obtaining consistent local details for a discriminative representation across variable environments. Current stripe-based feature learning approaches deliver impressive accuracy but do not strike a proper trade-off among diversity, locality, and robustness, so they easily suffer from part semantic inconsistency caused by the conflict between rigid partition and misalignment. This paper proposes a receptive multi-granularity learning approach to facilitate stripe-based feature learning. The approach performs local partition on the intermediate representations to operate on receptive region ranges, rather than on input images or output features as current approaches do, and can thus enhance the representation of locality while retaining proper local association. To this end, the local partitions are adaptively pooled using significance-balanced activations for uniform stripes. Random shifting augmentation is further introduced to increase the variance of the regions in which persons appear within bounding boxes, which eases misalignment. With a two-branch network architecture, discriminative identity representations at different scales can be learned. In this way, our model can provide a more comprehensive and efficient feature representation without larger model storage costs. Extensive experiments on intra-dataset and cross-dataset evaluations demonstrate the effectiveness of the proposed approach. In particular, our approach achieves a state-of-the-art accuracy of 96.2% Rank-1 and 90.0% mAP on the challenging Market-1501 benchmark.
ISSN:	1057-7149, 1941-0042
DOI:	10.1109/TIP.2020.2986878
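
Illustrative note: the summary refers to stripe-based feature learning, in which a feature map is partitioned into horizontal stripes that are each pooled into a local descriptor. The sketch below shows only the generic uniform-stripe baseline with plain average pooling. It is not the paper's adaptive, significance-balanced pooling, whose details this record does not give. The function name `stripe_average_pool` and the nested-list `[channel][row][col]` layout are assumptions made for illustration.

```python
from typing import List

# Hypothetical layout: feature map indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def stripe_average_pool(fmap: FeatureMap, num_stripes: int) -> List[List[float]]:
    """Split a C x H x W feature map into equal horizontal stripes and
    average-pool each stripe into one C-dimensional vector."""
    channels = len(fmap)
    height = len(fmap[0])
    if num_stripes <= 0 or height % num_stripes != 0:
        raise ValueError("height must be divisible by a positive num_stripes")
    rows_per_stripe = height // num_stripes

    pooled = []
    for s in range(num_stripes):
        top = s * rows_per_stripe
        bottom = top + rows_per_stripe
        vec = []
        for c in range(channels):
            # Flatten all activations of channel c that fall inside this stripe.
            values = [v for row in fmap[c][top:bottom] for v in row]
            vec.append(sum(values) / len(values))
        pooled.append(vec)
    return pooled


if __name__ == "__main__":
    # One channel, four rows, two columns, split into two stripes.
    toy = [[[1.0, 1.0], [3.0, 3.0], [5.0, 5.0], [7.0, 7.0]]]
    print(stripe_average_pool(toy, 2))
```

Each returned vector describes one body region at a fixed vertical position. That rigid positioning is the source of the misalignment problem the summary describes when a person is shifted within the bounding box.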