Randomized Distributed Mean Estimation: Accuracy vs. Communication

Bibliographic Details
Published in:	Frontiers in applied mathematics and statistics Vol. 4
Main Authors:	Konečný, Jakub; Richtárik, Peter
Format:	Journal Article
Language:	English
Published:	Frontiers Media S.A 18-12-2018
Subjects:	accuracy-communication tradeoff; communication efficiency; distributed mean estimation; gradient compression; quantization

Description
Summary:	We consider the problem of estimating the arithmetic average of a finite collection of real vectors stored in a distributed fashion across several compute nodes subject to a communication budget constraint. Our analysis does not rely on any statistical assumptions about the source of the vectors. This problem arises as a subproblem in many applications, including reduce-all operations within algorithms for distributed and federated optimization and learning. We propose a flexible family of randomized algorithms exploring the trade-off between expected communication cost and estimation error. Our family contains the full-communication and zero-error method on one extreme, and an ϵ-bit communication and O(1/(ϵn)) error method on the opposite extreme. In the special case where we communicate, in expectation, a single bit per coordinate of each vector, we improve upon existing results by obtaining O(r/n) error, where r is the number of bits used to represent a floating point value.
ISSN:	2297-4687
DOI:	10.3389/fams.2018.00062
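
Illustration:	The trade-off described in the summary can be sketched with a simple unbiased random-sparsification scheme. This is an illustrative assumption, not necessarily the authors' exact protocol: each node transmits a coordinate with probability p (rescaled so the estimate stays unbiased) and otherwise lets the server substitute the node's own scalar mean. The names encode, estimate_mean, p, and seed are hypothetical and chosen for this sketch only.

```python
import random


def encode(x, p, rng):
    """Randomly sparsify one node's vector x (illustrative scheme).

    Each coordinate is "sent" with probability p and rescaled so that the
    encoded value is unbiased; otherwise the node-local mean mu is used,
    which costs only one extra scalar per node.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    mu = sum(x) / len(x)
    encoded = []
    sent = 0  # coordinates that would actually be transmitted
    for v in x:
        if rng.random() < p:
            # E[encoded value] = p * (v/p - (1-p)/p * mu) + (1-p) * mu = v
            encoded.append(v / p - (1.0 - p) / p * mu)
            sent += 1
        else:
            encoded.append(mu)
    return encoded, sent


def estimate_mean(vectors, p, seed=0):
    """Average the encoded vectors from all nodes.

    Returns the estimate and the total number of transmitted coordinates.
    """
    rng = random.Random(seed)
    n, d = len(vectors), len(vectors[0])
    total = [0.0] * d
    sent_total = 0
    for x in vectors:
        y, sent = encode(x, p, rng)
        sent_total += sent
        for j in range(d):
            total[j] += y[j]
    return [t / n for t in total], sent_total
```

With p = 1 every coordinate is sent and the estimate equals the exact average (the full-communication, zero-error extreme). Lowering p reduces the expected number of transmitted coordinates to p times the dimension per node, at the price of higher variance in the estimate.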