Sublinear Time and Space Algorithms for Correlation Clustering via Sparse-Dense Decompositions
Abstract:
We present a new approach for solving (minimum disagreement) correlation clustering that results in
sublinear algorithms with highly efficient time and space complexity for this problem. In particular, we
obtain the following algorithms for n-vertex (+/−)-labeled graphs G:
- A sublinear-time algorithm that with high probability returns a constant approximation clustering
of G in O(n log2 n) time assuming access to the adjacency list of the (+)-labeled edges of G (this
is almost quadratically faster than even reading the input once). Previously, no sublinear-time
algorithm was known for this problem with any multiplicative approximation guarantee.
- A semi-streaming algorithm that with high probability returns a constant approximation clustering
of G in O(n log n) space and a single pass over the edges of the graph G (this memory is almost
quadratically smaller than input size). Previously, no single-pass algorithm with o(n^2) space was
known for this problem with any approximation guarantee.
The main ingredient of our approach is a novel connection to sparse-dense graph decompositions
that are used extensively in the graph coloring literature. To our knowledge, this connection is the first
application of these decompositions beyond graph coloring, and in particular for the correlation clustering
problem, and can be of independent interest.
Conference version:
[PDF]