On the Use of Anchoring for Training Vision Models
Main Authors: | , , , , , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 01-06-2024 |
Subjects: | |
Summary: | Anchoring is a recent, architecture-agnostic principle for training deep
neural networks that has been shown to significantly improve uncertainty
estimation, calibration, and extrapolation capabilities. In this paper, we
systematically explore anchoring as a general protocol for training vision
models, providing fundamental insights into its training and inference
processes and their implications for generalization and safety. Despite the
promise of anchoring, we identify a critical problem in anchored training: an
increased risk of learning undesirable shortcuts, which limits
generalization. To address this, we introduce a new anchored
training protocol that employs a simple regularizer to mitigate this issue and
significantly enhances generalization. We empirically evaluate our proposed
approach across datasets and architectures of varying scales and complexities,
demonstrating substantial performance gains in generalization and safety
metrics compared to the standard training protocol. |
---|---|
DOI: | 10.48550/arxiv.2406.00529 |