BayBFed: Bayesian Backdoor Defense for Federated Learning
Main Authors: , , , ,
Format: Journal Article
Language: English
Published: 23-01-2023
Summary: Federated learning (FL) allows participants to jointly train a machine learning model without sharing their private data with others. However, FL is vulnerable to poisoning attacks such as backdoor attacks. Consequently, a variety of defenses have recently been proposed, which have primarily utilized intermediary states of the global model (i.e., logits) or the distance of the local models from the global model (i.e., the L2-norm) to detect malicious backdoors. However, as these approaches operate directly on client updates, their effectiveness depends on factors such as the clients' data distribution or the adversary's attack strategies. In this paper, we introduce a novel and more generic backdoor defense framework, called BayBFed, which utilizes probability distributions over client updates to detect malicious updates in FL: it computes a probabilistic measure over the clients' updates to keep track of any adjustments made in the updates, and uses a novel detection algorithm that leverages this probabilistic measure to efficiently detect and filter out malicious updates. It thereby overcomes the shortcomings of previous approaches that arise from the direct use of client updates, since our probabilistic measure incorporates all aspects of the local clients' training strategies. BayBFed utilizes two Bayesian non-parametric extensions: (i) a Hierarchical Beta-Bernoulli process to draw a probabilistic measure given the clients' updates, and (ii) an adaptation of the Chinese Restaurant Process (CRP), which we refer to as CRP-Jensen, that leverages this probabilistic measure to detect and filter out malicious updates. We extensively evaluate our defense approach on five benchmark datasets: CIFAR10, Reddit, IoT intrusion detection, MNIST, and FMNIST, and show that it can effectively detect and eliminate malicious updates in FL without deteriorating the benign performance of the global model.
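
The summary describes the defense only at a high level. The minimal Python sketch below illustrates the general idea of mapping each client's update to a probabilistic measure and filtering clients whose measure diverges from the group. The functions `beta_bernoulli_measure`, `jensen_shannon`, and `filter_updates`, as well as the mean-plus-two-standard-deviations cutoff, are hypothetical simplifications for illustration only; they are not the paper's hierarchical Beta-Bernoulli process or its CRP-Jensen detection algorithm.

```python
import numpy as np

def beta_bernoulli_measure(update, prior_a=1.0, prior_b=1.0):
    """Summarize a client's weight-update vector as a single probability.

    Illustrative only: each coordinate is binarized (did the weight move up?)
    and summarized by the posterior mean of a Beta-Bernoulli model. The
    hierarchical Beta-Bernoulli process in the paper is far more involved.
    """
    z = (update > 0).astype(float)       # Bernoulli "success" indicators
    a = prior_a + z.sum()                # posterior Beta parameters
    b = prior_b + (1.0 - z).sum()
    return a / (a + b)                   # posterior mean success probability

def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two Bernoulli distributions p and q."""
    def kl(x, y):
        eps = 1e-12
        x = np.clip(np.array([x, 1.0 - x]), eps, 1.0)
        y = np.clip(np.array([y, 1.0 - y]), eps, 1.0)
        return float(np.sum(x * np.log(x / y)))
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def filter_updates(client_updates, threshold=None):
    """Flag clients whose probabilistic measure deviates most from the group.

    The default threshold (mean + 2 * std of the divergences) is a hypothetical
    filtering rule, not the CRP-Jensen criterion from the paper.
    """
    measures = np.array([beta_bernoulli_measure(u) for u in client_updates])
    group = measures.mean()
    divergences = np.array([jensen_shannon(p, group) for p in measures])
    if threshold is None:
        threshold = divergences.mean() + 2 * divergences.std()
    keep = divergences <= threshold
    return keep, divergences

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    benign = [rng.normal(0.0, 0.01, size=1000) for _ in range(9)]
    poisoned = [rng.normal(0.05, 0.01, size=1000)]   # strongly shifted update
    keep, div = filter_updates(benign + poisoned)
    print("kept:", keep)
    print("divergences:", np.round(div, 4))
```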
DOI: 10.48550/arxiv.2301.09508