Learning the Unlearnable: Adversarial Augmentations Suppress Unlearnable Example Attacks
Format: Journal Article
Language: English
Published: 27-03-2023
Summary: Unlearnable example attacks are data poisoning techniques that can be used to safeguard public data against unauthorized use for training deep learning models. These methods add stealthy perturbations to the original images, making it difficult for deep learning models to learn effectively from the poisoned training data. Current research suggests that adversarial training can, to a certain degree, mitigate the impact of unlearnable example attacks, while common data augmentation methods are not effective against such poisons. Adversarial training, however, demands considerable computational resources and can result in non-trivial accuracy loss. In this paper, we introduce UEraser, which outperforms current defenses against different types of state-of-the-art unlearnable example attacks through a combination of effective data augmentation policies and loss-maximizing adversarial augmentations. In stark contrast to current state-of-the-art adversarial training methods, UEraser uses adversarial augmentations, which extend beyond the confines of the $\ell_p$ perturbation budget assumed by current unlearning attacks and defenses. It also helps to improve the model's generalization ability, thus guarding against accuracy loss. UEraser wipes out the unlearning effect with error-maximizing data augmentations, thereby restoring trained model accuracies. Interestingly, UEraser-Lite, a fast variant without adversarial augmentations, is also highly effective in preserving clean accuracies. On challenging unlearnable CIFAR-10, CIFAR-100, SVHN, and ImageNet-subset datasets produced with various attacks, UEraser achieves results comparable to those obtained with clean training. We also demonstrate its efficacy against possible adaptive attacks. Our code is open source and available to the deep learning community: https://github.com/lafeat/ueraser
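
As a rough illustration of the error-maximizing augmentation idea described in the abstract, the sketch below samples several strongly augmented views of each training batch and backpropagates through the per-example view with the highest cross-entropy loss. The augmentation policy, trial count, and function names here are placeholder assumptions for illustration, not the authors' implementation; the actual UEraser code is available at https://github.com/lafeat/ueraser.

```python
import torch
import torch.nn.functional as F
from torchvision import transforms

# Placeholder strong augmentation policy (assumed, not the UEraser policy);
# expects float image tensors in [0, 1] with shape [N, C, H, W].
strong_augment = transforms.Compose([
    transforms.RandomResizedCrop(32, scale=(0.5, 1.0)),
    transforms.RandomHorizontalFlip(),
    transforms.ColorJitter(0.4, 0.4, 0.4, 0.1),
])

def adversarial_augmentation_loss(model, images, labels, num_trials=5):
    """Sample `num_trials` augmented views of the batch and keep, for each
    example, the view with the largest cross-entropy loss (error-maximizing
    augmentation); return the mean of those worst-case losses."""
    worst_loss = None
    for _ in range(num_trials):
        augmented = strong_augment(images)          # one randomly augmented view
        logits = model(augmented)
        loss = F.cross_entropy(logits, labels, reduction="none")
        worst_loss = loss if worst_loss is None else torch.maximum(worst_loss, loss)
    return worst_loss.mean()

# Usage in a training step (sketch): maximize the loss over augmentations,
# then minimize it over model parameters.
#   loss = adversarial_augmentation_loss(model, x, y, num_trials=5)
#   loss.backward()
#   optimizer.step()
```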
DOI: 10.48550/arxiv.2303.15127