Massively Parallel Maximum Coverage Revisited
Main Authors: | , |
---|---|
Format: | Journal Article |
Language: | English |
Published: | 17-11-2024 |
Subjects: | |
Summary: | We study the maximum set coverage problem in the massively parallel model. In
this setting, $m$ sets that are subsets of a universe of $n$ elements are
distributed among $m$ machines. In each round, these machines can communicate
with each other, subject to the memory constraint that no machine may use more
than $\tilde{O}(n)$ memory. The objective is to find the $k$ sets whose
coverage is maximized. We consider the regime where $k = \Omega(m)$, $m =
O(n)$, and each machine has $\tilde{O}(n)$ memory. Maximum coverage is a
special case of the submodular maximization problem subject to a cardinality
constraint. This problem can be approximated to within a $1-1/e$ factor using
the greedy algorithm, but this approach is not directly applicable to parallel
and distributed models. When $k = \Omega(m)$, to obtain a $1-1/e-\epsilon$
approximation, previous work either requires $\tilde{O}(mn)$ memory per machine,
which is not interesting compared to the trivial algorithm that sends the
entire input to a single machine, or requires $2^{O(1/\epsilon)} n$ memory per
machine, which is prohibitively expensive even for a moderately small value of
$\epsilon$. Our result is a randomized $(1-1/e-\epsilon)$-approximation
algorithm that uses $O(1/\epsilon^3 \cdot \log m \cdot (\log (1/\epsilon) +
\log m))$ rounds. Our algorithm involves solving a slightly transformed linear
program of the maximum coverage problem using the multiplicative weights update
method, classic techniques in parallel computing such as parallel prefix, and
various combinatorial arguments. |
---|---|
DOI: | 10.48550/arxiv.2411.11277 |
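
For reference, the sequential greedy algorithm mentioned in the summary can be sketched as follows. This is a minimal illustration of the classic $1-1/e$ greedy baseline, not the paper's parallel algorithm; the function name `greedy_max_coverage` and the input format (a list of Python sets) are assumptions made for this example.

```python
def greedy_max_coverage(sets, k):
    """Sequential greedy baseline for maximum coverage (illustrative sketch).

    sets: list of iterables over universe elements (hypothetical input format).
    k:    maximum number of sets to pick.
    Returns (chosen indices in pick order, set of covered elements).
    """
    sets = [set(s) for s in sets]
    covered = set()
    chosen = []
    remaining = set(range(len(sets)))

    for _ in range(min(k, len(sets))):
        # Pick the set with the largest marginal gain; ties go to the smallest index.
        best = max(remaining, key=lambda i: (len(sets[i] - covered), -i))
        if not sets[best] - covered:
            break  # no remaining set covers anything new
        chosen.append(best)
        covered |= sets[best]
        remaining.remove(best)

    return chosen, covered


example_sets = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}, {1, 7}]
chosen, covered = greedy_max_coverage(example_sets, k=2)
print(chosen, sorted(covered))
```

Each iteration depends on the elements covered by all earlier picks, which is why this approach does not carry over directly to the parallel and distributed settings the summary describes.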