Mitigating Disparate Impact of Differential Privacy in Federated Learning through Robust Clustering
Main Authors: , ,
Format: Journal Article
Language: English
Published: 29-05-2024
Subjects:
Online Access: Get full text
Summary: Federated Learning (FL) is a decentralized machine learning (ML) approach that keeps data localized and often incorporates Differential Privacy (DP) to strengthen privacy guarantees. Consistent with previous work on DP in ML, we observe that differentially private federated learning (DPFL) introduces performance disparities, particularly affecting minority groups. Recent work has attempted to address performance fairness in vanilla FL through clustering, but this clustering remains sensitive and prone to errors, which are further exacerbated by the DP noise in DPFL. To address this gap, we propose a novel clustered DPFL algorithm designed to effectively identify clients' clusters in highly heterogeneous settings while maintaining high accuracy under DP guarantees. To this end, we cluster clients based on both their model updates and their training loss values. Our approach also addresses the server's uncertainty in clustering clients' model updates by employing larger batch sizes along with a Gaussian Mixture Model (GMM) to alleviate the impact of noise and potential clustering errors, especially in privacy-sensitive scenarios. We provide a theoretical analysis of the effectiveness of our approach, and we extensively evaluate it across diverse data distributions and privacy budgets, showing that it mitigates the disparate impact of DP in FL settings at a small computational cost.
DOI: 10.48550/arxiv.2405.19272
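The summary only describes the clustering step at a high level. As a rough illustration of the kind of procedure it mentions (soft-clustering clients from their noisy model updates and training losses with a GMM), the sketch below uses scikit-learn's GaussianMixture. The PCA compression of updates, the function name, and the feature construction are assumptions made for this sketch, not the authors' implementation.

```python
# Illustrative sketch only: soft-cluster clients from DP-noised model
# updates plus per-client training losses using a Gaussian Mixture Model.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.mixture import GaussianMixture


def cluster_clients(updates, losses, n_clusters=2, n_components=5, seed=0):
    """updates: (n_clients, d) flattened, noisy model updates.
    losses:  (n_clients,) per-client training losses."""
    # Compress high-dimensional updates so the GMM stays well-conditioned
    # (assumed here; the paper itself relies on larger batch sizes to
    # reduce the noise in the clustered statistics).
    n_components = min(n_components, updates.shape[0], updates.shape[1])
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(updates)
    # Append the loss value as an extra clustering feature.
    features = np.hstack([reduced, losses.reshape(-1, 1)])
    gmm = GaussianMixture(
        n_components=n_clusters, covariance_type="full", random_state=seed
    ).fit(features)
    # Soft responsibilities let the server hedge against clustering errors
    # instead of committing to a single hard assignment per client.
    return gmm.predict_proba(features), gmm.predict(features)


# Toy usage: 8 clients drawn from 2 latent groups, 16-dim noisy updates.
rng = np.random.default_rng(0)
updates = np.vstack([rng.normal(0.0, 1.0, (4, 16)), rng.normal(3.0, 1.0, (4, 16))])
losses = np.concatenate([rng.normal(0.5, 0.1, 4), rng.normal(1.5, 0.1, 4)])
probs, labels = cluster_clients(updates, losses)
print(labels)
```

The soft assignment probabilities, rather than hard labels, are one plausible way to express the server's uncertainty about noisy updates that the summary alludes to.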