|
|
Instructor |
Sepehr Assadi |
Credits |
3 units |
Schedule |
Tuesdays 12:00 PM - 3:00 PM (Online) |
Prerequisites |
Background on (randomized) algorithms, complexity theory, discrete mathematics, and probability theory (basic concentration inequalities).
Mathematical maturity and some experience with theoretical computer science beyond the introductory level, say, Advanced Algorithms (CS 513, CS 514, or an equivalent) is necessary. |
Syllabus |
The full course syllabus is available here. This webpage contains the highlights of course syllabus that are potentially updated as the semester progresses. |
Massive graphs abound: web-graphs, social networks, or biological networks are just a few examples. However, the traditional algorithmic approaches to these problems do not accurately capture
the challenges of processing massive graphs. For instance, we can no longer assume a (fast) random access to the entire input on these graphs, as their sheer size prevents us from loading them into the main memory.
As a result, there is a rapidly growing interest in developing algorithms that explicitly account for the restrictions of processing massive inputs. One particularly successful approach here is
graph streaming algorithms. These are algorithms that process the input graphs by making one or few sequential passes over their edges while using a limited memory---these constraints capture several challenges of processing massive graphs such as I/O-efficiency or monitoring evolving graphs.
In this course, we will study the general picture of the landscape of powers and limitations of graph streaming algorithms, some of the key ideas
in establishing these results, and some of the main open questions and gaps in our understanding of this model.
This is a seminar course organized in the format of a focused reading group on the topic of graph streaming algorithms and lower bounds. The courseload involves writting criticial reviews,
solving algorithmic problems, and presenting papers.
- Instructor: Sepehr Assadi
- Instructor Email: sepehr.assadi@rutgers.edu
- Lecture Schedule: Tuesdays 12:00 PM - 3:00 PM (Online)
- Office hours: Mondays 3:00 PM to 4:00 PM, or by appointment (Online)
- This course has no recitation sections
This is a
synchronous online course:
- Online lectures: The class will be taught via Zoom. This is a seminar course and the students are expected to attend every lecture. Moreover, the students are required to keep their videos ON during the class; please contact the Instructor in advance if you have a valid reason for not doing so.
- Technology recommendation: The students are expected to have a working microphone for discussions during the lectures and for their own presentations.
For their presentations, the students may choose to use slides and/or
an online blackboard; a stylus is necessary for the latter option as trackpads and external mouses do not work fine for an entire presentation on a board.
- Recordings: The lectures may be recorded for students who are not able to attend the lecture. They will be posted only on Canvas. These recordings are solely for the students registered in the course and are not to be redistributed
outside of this class.
The final grade for the course will be based on the following weights:
- 25% Reviews
- 25% Problem sets
- 50% Presentations
Students are expected to follow
Rutgers academic integrity policy for all their work in this course.
The following is a summary of each of these components; see the
course syllabus for more information.
Reviews: A key component of this course is to write critical reviews for the discussed papers in a format of conference reviews for TCS papers.
The main purpose of the reviews is to familiarize the students with each paper before they are being discussed in the class.
The reviews are for the weekly assigned readings and are due on
11:59pm EST on Monday each week,
the day before the corresponding paper is being discussed in the class. The reviews should be turned in on
Canvas; please use this
template
for your answers.
Problem sets: Each week, we will also have a single algorithmic or lower bound problem on the topic of the discussed paper.
The main goal of these problem sets is to allow the students to test their understanding of the discussed topics in more depths. The solutions to these problemsets must be typeset in LaTeX and submitted via
Canvas by 11:59pm EST on
Tuesday the problem set is due; see
LaTeX section for some resources on using LaTeX. The students are allowed to discuss the problem sets among themselves and there is no limit on the number of involved students. However, (1) the students should write their solutions completely independently (in particular, you should understand and be able to explain everything that is written in your solution); (2) you should include the name of your collaborators in your solutions.
Presentations: Each student is expected to present one or two papers to the class during the semester. This is the main component of the course for each student and involves
the following three parts: (i) preparing the materials for the presentation (such as slides)
one week before the in-class presentation to receive feedback from the Instructor; (ii) an in-class presentation
for
an entire lecture including the discussion of the main result, its proof, and any necessary background; (iii) a report on the presented paper including the proof of the main result presented in the class---the target of these notes are other students in the class, and the notes in particular should help other students to answer the problem set given for this presentation. The students do not need to write a review nor turn in a solution for the problem set for the paper they are presenting themselves.
The list of papers for presentation along with their allocated time slot is available on the
Course Calendar: the lectures marked with * are for students' presentation---the main paper
to present is the assigned reading paper but in some cases, the extra references give a simpler proof of the same result; in that case, the proof can be presented from the other papers.
Students should send a list of three papers they would like to present to the Instructor by
September 14th, 11:59PM EST.
The following is a schedule of the lectures. The lectures marked with * are for students' presentations. The lecture notes marked as draft have not been proofread by
the Instructor yet.
# |
Date |
Topics |
Assigned Reading |
Extra References |
Notes |
Psets |
1 |
Tue 09/01 |
Course Policy, Graph Streaming, Communication Complexity |
-- |
FKMSZ04, B08 |
[lec1] |
[pset1] |
|
Tue 09/08 |
No Class: Monday's schedule |
|
|
|
|
2 |
Tue 09/15 |
Maximum Cardinality Matching in a Single Pass |
GKK12 |
K13, AB19, KMM12, B20 |
[lec2] |
[pset2] |
3 |
Tue 09/22 |
Palette Sparsification: (Δ+1) Coloring in a Single Pass |
ACK19a |
AA20, BCG20 |
[lec3] |
[pset3] |
4 |
Tue 09/29 |
Local Ratio: Maximum Weight Matching in a Single Pass* |
PS17 |
GW19, FKMSZ04, CS14 |
[lec4] |
[pset4] |
5 |
Tue 10/06 |
Graph Sketching and Dynamic Streams: Min Cut and Sparsifiers* |
AGM12b |
AGM12a, KLMMS14 |
[lec5] |
[pset5] |
6 |
Tue 10/13 |
Graph Sketching and the SMP model: Maximum Matching* |
AKLY16 |
DK20, AKL17 |
[lec6-draft] |
[pset6] |
7 |
Tue 10/20 |
Information Complexity: Streaming Coverage Problems |
AKL16 |
ER14, HIMV16, CW16, MV17 |
[lec7]
|
[pset7]
|
8 |
Tue 10/27 |
Sample-and-Prune for Multi-Pass Graph Streaming Algorithms |
KMVV13 |
ACGMW15 |
|
[pset8]
|
9 |
Tue 11/03 |
Iterative Methods: Weighted Matching in Multiple Passes* |
AG11 |
AG15 |
|
[pset9]
|
10 |
Tue 11/10 |
Iterative Methods: Shortest Path in Multiple Passes* |
BFKL17 |
|
[lec10-draft]
|
[pset10]
|
11 |
Tue 11/17 |
Sparsificiation: Exact Minimum Cut in Two Passes* |
-- |
RSW18 |
[lec11-draft]
|
[pset11]
|
12 |
Tue 11/24 |
Pointer Chasing for Multipass Graph Streaming Lower Bounds* |
GO13 |
FKMSZ05, ACK19b, AR20 |
[lec12-draft]
|
-- |
13 |
Tue 12/01 |
Maximum Matching in Random-Order Streams (by A. Bernstein) |
B20 |
ABBMS19, FHMRR20 |
-- |
[pset12]
|
14 |
Tue 12/08 |
Subgraph Counting in Small Space* |
MVV16 |
BC17, BFLMS06 |
|
|
There is no official textbook for this course and all required materials will be posted on this webpage.
The following is a list of some helpful supplementary materials:
This is a list of the papers related to the topics discussed in the lectures. The list may be updated throughout the semester to add the relevant papers.
Disclaimer: This course only covers a small subset of the vast literature in graph streaming and the discussed papers are by no means a comprehensive list of the entire area.
ACGMW15 |
Kook Jin Ahn, Graham Cormode, Sudipto Guha, Andrew McGregor, Anthony Wirth,
Correlation Clustering in Data Streams. ICML 2015. |
AG11 |
Kook Jin Ahn, Sudipto Guha,
Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem. ICALP 2011. |
AG15 |
Kook Jin Ahn, Sudipto Guha,
Access to Data and Number of Iterations: Dual Primal Algorithms for Maximum Matching under Resource Constraints. SPAA 2015, TOPC 2018. |
AGM12a |
Kook Jin Ahn, Sudipto Guha, Andrew McGregor,
Analyzing Graph Structure via Linear Measurements. SODA 2012. |
AGM12b |
Kook Jin Ahn, Sudipto Guha, Andrew McGregor,
Graph sketches: sparsification, spanners, and subgraphs. PODS 2012. |
AA20 |
Noga Alon, Sepehr Assadi, Palette Sparsification Beyond (Δ+1) Vertex Coloring.
RANDOM 2020. |
AB19 |
Sepehr Assadi, Aaron Bernstein, Towards a Unified Theory of Sparsification for Matching Problems.
SOSA@SODA 2019. |
ABBMS19 |
Sepehr Assadi, MohammadHossein Bateni, Aaron Bernstein, Vahab Mirrokni, Cliff Stein,
Coresets Meet EDCS: Algorithms for Matching and Vertex Cover on Massive Graphs.
SODA 2019. |
ACK19a |
Sepehr Assadi, Yu Chen, Sanjeev Khanna,
Sublinear Algorithms for (Δ+1) Vertex Coloring. SODA 2019. |
ACK19b |
Sepehr Assadi, Yu Chen, Sanjeev Khanna,
Polynomial Pass Lower Bounds for Graph Streaming Algorithms. STOC 2019. |
AKL16 |
Sepehr Assadi, Sanjeev Khanna, Yang Li, Tight Bounds for Single-Pass Streaming Complexity of the Set Cover Problem.
STOC 2016, SICOMP 2019. |
AKL17 |
Sepehr Assadi, Sanjeev Khanna, Yang Li, On Estimating Maximum Matching Size in Graph Streams.
SODA 2017. |
AKLY16 |
Sepehr Assadi, Sanjeev Khanna, Yang Li, Grigory Yaroslavtsev, Maximum Matchings in Dynamic Graph Streams and the Simultaneous Communication Model.
SODA 2016. |
AR20 |
Sepehr Assadi, Ran Raz, Near-Quadratic Lower Bounds for Two-Pass Graph Streaming Algorithms.
FOCS 2020. |
B08 |
Surender Baswana,
Streaming algorithm for graph spanners - single pass and constant processing time per edge..
Inf. Process. Lett 2008. |
BFKL17 |
Ruben Becker, Sebastian Forster, Andreas Karrenbauer, Christoph Lenzen,
Near-Optimal Approximate Shortest Paths and Transshipment in Distributed and Streaming Models.
DISC 2017. |
BCG20 |
Suman K. Bera, Amit Chakrabarti, Prantar Ghosh, Graph Coloring via Degeneracy in Streaming and Other Space-Conscious Models.
ICALP 2020. |
BC17 |
Suman K. Bera, Amit Chakrabarti, Towards Tighter Space Bounds for Counting Triangles and Other Substructures in Graph Streams.
STACS 2017. |
B20 |
Aaron Bernstein, Improved Bound for Matching in Random-Order Streams.
ICALP 2020. |
BFLMS06 |
Luciana S. Buriol, Gereon Frahling, Stefano Leonardi, Alberto Marchetti-Spaccamela, Christian Sohler, Counting triangles in data streams.
PODS 2016. |
CW16 |
Amit Chakrabarti, Anthony Wirth Incidence Geometries and the Pass Complexity of Semi-Streaming Set Cover.
SODA 2016. |
CS14 |
Michael Crouch, Daniel S. Stubbs Improved Streaming Algorithms for Weighted Matching, via Unweighted Matching.
APPROX-RANDOM 2014. |
DK20 |
Jacques Dark, Christian Konrad Optimal Lower Bounds for Matching and Vertex Cover in Dynamic Graph Streams.
CCC 2020. |
ER14 |
Yuval Emek, Adi Rosén Semi-Streaming Set Cover.
ICALP 2014, TALG 2016. |
FHMRR20 |
Alireza Farhadi, Mohammad Taghi Hajiaghayi, Tung Mai, Anup Rao, Ryan A. Rossi, Approximate Maximum Matching in Random Streams. SODA2020. |
FKMSZ04 |
Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, Jian Zhang, On Graph Problems in a Semi-streaming Model.
ICALP 2004, Theor. Comput. Sci. 2005. |
FKMSZ05 |
Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, Jian Zhang,
Graph Distances in the Data-Stream Model.
SODA 2005, SICOMP 2008. |
GW19 |
Mohsen Ghaffari, David Wajc, Simplified and Space-Optimal Semi-Streaming (2+ε)-Approximate Matching.
SOSA@SODA 2019. |
GKK12 |
Ashish Goel, Michael Kapralov, Sanjeev Khanna,
On the communication and streaming complexity of maximum bipartite matching.
SODA 2012. |
GO13 |
Venkatesan Guruswami, Krzysztof Onak,
Superlinear lower bounds for multipass graph processing.
CCC 2013, Algorithmica 2016. |
HIMV16 |
Sariel Har-Peled, Piotr Indyk, Sepideh Mahabadi, Ali Vakilian,
Towards Tight Bounds for the Streaming Set Cover Problem.
PODS 2016. |
K13 |
Michael Kapralov Better bounds for matchings in the streaming model.
SODA 2013. |
KKS15 |
Michael Kapralov, Sanjeev Khanna, Madhu Sudan, Streaming Lower Bounds for Approximating MAX-CUT.
SODA 2015. |
KK19 |
Michael Kapralov, Dmitry Krachun An Optimal Space Lower Bound for Approximating MAX-CUT.
SODA 2013. |
KLMMS14 |
Michael Kapralov, Yin Tat Lee, Cameron Musco, Christopher Musco, Aaron Sidford, Single Pass Spectral Sparsification in Dynamic Streams.
FOCS 2014, SICOMP 2017. |
KMM12 |
Christian Konrad, Frédéric Magniez, Claire Mathieu, Maximum Matching in Semi-Streaming with Few Passes.
APPROX-RANDOM 2012. |
KMVV13 |
Ravi Kumar, Benjamin Moseley, Sergei Vassilvitskii, Andrea Vattani, Fast greedy algorithms in mapreduce and streaming.
SPAA 2013, TOPC 2015. |
MV17 |
Andrew McGregor, Hoa T. Vu, Better Streaming Algorithms for the Maximum Coverage Problem.
ICDT 2017, Theory Comput. Syst. 2019. |
MVV16 |
Andrew McGregor, Sofya Vorotnikova, Hoa T. Vu, Better Algorithms for Counting Triangles in Data Streams.
PODS 2016. |
PS17 |
Ami Paz, Gregory Schwartzman A (2 + ε)-Approximation for Maximum Weight Matching in the Semi-Streaming Model.
SODA 2017. |
RSW18 |
Aviad Rubinstein, Tselil Schramm, S. Matthew Weinberg, Computing Exact Minimum Cuts Without Knowing the Graph. ITCS 2018. |
VY11 |
Elad Verbin, Wei Yu, The Streaming Complexity of Cycle Counting, Sorting by Reversals, and Other Problems. SODA 2011. |
You can download LaTeX for free
here. For the purpose of this course, you do not even need to install LaTeX and
can instead use an online LaTeX editor such as
Overleaf.
Two great introductory resources for LaTeX are
A Short Introduction to LaTeX by
Allin Cottrell
(for general purpose LaTeX)
and
LaTeX for Undergraduates by
Jim Hefferson (for undergraduates mathematics)
accompanied by the following
cheatsheet (note that this document use "\( MATH \)" notation compared to the perhaps more
widely used "$ MATH $" -- both are completely fine in LaTeX).
You can also use this wonderful tool
Detexify by
Daniel Kirsch for finding the
LaTeX commands of a symbol (just draw the symbol!).
If you are interested in learning more about LaTeX (beyond what is needed for this course), check the
Wikibook on LaTeX and
the
Wikibook on LaTeX for Mathematics.