PViT: Prior-augmented Vision Transformer for Out-of-distribution Detection
Format: Journal Article
Language: English
Published: 27-10-2024
Summary: Vision Transformers (ViTs) have achieved remarkable success across various vision tasks, yet their robustness to data distribution shifts and their inherent inductive biases remain underexplored. To enhance the robustness of ViT models for image Out-of-Distribution (OOD) detection, we introduce a novel and generic framework named Prior-augmented Vision Transformer (PViT). PViT identifies OOD samples by quantifying the divergence between the predicted class logits and the prior logits obtained from pre-trained models. Unlike existing state-of-the-art OOD detection methods, PViT shapes the decision boundary between ID and OOD data using the proposed prior-guided confidence, without requiring additional data modeling, generation methods, or structural modifications. Extensive experiments on the large-scale ImageNet benchmark demonstrate that PViT significantly outperforms existing state-of-the-art OOD detection methods. Additionally, through comprehensive analyses, ablation studies, and discussions, we show how PViT can strategically address specific challenges in managing large vision models, paving the way for new advances in OOD detection.
DOI: 10.48550/arxiv.2410.20631
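The abstract describes scoring a sample as OOD by measuring how far the model's predicted class logits diverge from prior logits produced by a pre-trained model. The sketch below illustrates that general idea with a KL divergence between the two softmax distributions; the function name `ood_score` and the choice of KL are illustrative assumptions, not the paper's exact prior-guided confidence formulation.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def ood_score(pred_logits, prior_logits):
    """Divergence between prior and predicted class distributions.

    Larger divergence suggests the sample is OOD. This is a generic
    KL-based sketch, not PViT's actual scoring function.
    """
    p = softmax(prior_logits)   # distribution from the pre-trained prior model
    q = softmax(pred_logits)    # distribution from the PViT-style predictor
    eps = 1e-12                 # guard against log(0)
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

# ID-like case: prediction agrees with the prior -> near-zero score.
id_score = ood_score(np.array([5.0, 0.0, 0.0]), np.array([5.0, 0.0, 0.0]))

# OOD-like case: prediction contradicts the prior -> large score.
ood = ood_score(np.array([0.0, 5.0, 0.0]), np.array([5.0, 0.0, 0.0]))
```

In practice, a threshold on such a score separates ID from OOD inputs: samples whose predicted distribution tracks the prior are kept, while strongly divergent ones are flagged.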