An Efficient Framework for Clustered Federated Learning
Published in: IEEE Transactions on Information Theory, Vol. 68, No. 12, pp. 8076-8091
Main Authors:
Format: Journal Article
Language: English
Published: New York: IEEE (The Institute of Electrical and Electronics Engineers, Inc.), 01-12-2022
Summary: We address the problem of federated learning (FL) where users are distributed and partitioned into clusters. This setup captures settings where different groups of users have their own objectives (learning tasks), but by aggregating their data with others in the same cluster (same learning task), they can leverage strength in numbers to perform federated learning more efficiently. For this new framework of clustered federated learning, we propose the Iterative Federated Clustering Algorithm (IFCA), which alternately estimates the cluster identities of the users and optimizes the model parameters for the user clusters via gradient descent. We analyze the convergence rate of this algorithm first in a linear model with squared loss and then for generic strongly convex and smooth loss functions. We show that in both settings, with good initialization, IFCA is guaranteed to converge, and we discuss the optimality of the statistical error rate. In particular, for the linear model with two clusters, we can guarantee that our algorithm converges as long as the initialization is slightly better than random. When the clustering structure is ambiguous, we propose to train the models by combining IFCA with the weight-sharing technique from multi-task learning. In our experiments, we show that the algorithm can succeed even when the initialization requirement is relaxed to random initialization with multiple restarts. We also present experimental results showing that our algorithm is effective on non-convex problems such as neural networks, and we demonstrate the benefits of IFCA over the baselines on several clustered FL benchmarks. (A minimal code sketch of the alternating procedure follows this record.)
ISSN: 0018-9448, 1557-9654
DOI: 10.1109/TIT.2022.3192506
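
The alternating structure described in the summary (estimate each user's cluster identity, then run gradient descent per cluster) is compact enough to sketch in code. Below is a minimal, single-machine illustration of that idea for the linear-model/squared-loss setting; the function name `ifca`, the synthetic data, and all hyperparameters are assumptions made for illustration, not the authors' implementation, which is stated for the general federated setting and includes the weight-sharing variant mentioned above.

```python
import numpy as np

def ifca(user_data, k, dim, num_rounds=50, lr=0.1, seed=0):
    """Sketch of IFCA for linear regression with squared loss.

    user_data: list of (X, y) pairs, one per user.
    k: number of clusters; dim: model dimension.
    """
    rng = np.random.default_rng(seed)
    # One parameter vector per cluster. The theory assumes a good
    # initialization; random init with restarts is the practical fallback.
    theta = rng.normal(size=(k, dim))
    for _ in range(num_rounds):
        grads = np.zeros((k, dim))
        counts = np.zeros(k, dtype=int)
        for X, y in user_data:
            # Step 1: estimate this user's cluster identity as the
            # cluster model with the lowest loss on the user's own data.
            losses = [np.mean((X @ theta[j] - y) ** 2) for j in range(k)]
            j = int(np.argmin(losses))
            # Step 2: local gradient of the squared loss at theta[j].
            grads[j] += 2.0 * X.T @ (X @ theta[j] - y) / len(y)
            counts[j] += 1
        # Server step: average gradients within each cluster and descend.
        for j in range(k):
            if counts[j] > 0:
                theta[j] -= lr * grads[j] / counts[j]
    return theta

# Synthetic check (hypothetical two-cluster setup): 20 users whose
# labels come from one of two ground-truth linear models.
rng = np.random.default_rng(1)
true_models = np.array([[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]])
users = []
for i in range(20):
    X = rng.normal(size=(30, 3))
    users.append((X, X @ true_models[i % 2] + 0.05 * rng.normal(size=30)))
theta_hat = ifca(users, k=2, dim=3)
```

Note the design choice the paper's analysis turns on: cluster identities are re-estimated from the current models at every round, so an incorrect early assignment can be corrected as the cluster models separate.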