||Tuesdays 3:00 PM - 6:00 PM in
TIL-226 (Livingston campus)
|| Undergraduate courses on algorithms, complexity theory, discrete mathematics, and probability; mathematical maturity.
||The full course syllabus is available here. This webpage contains the highlights of the course syllabus, which may be updated as the semester progresses.
With the emergence of massive datasets across different application domains, there is a rapidly growing
interest in solving various problems over immense amounts of data. However, even the most basic algorithms can become computationally prohibitive when processing massive datasets, as the inputs are often too large
to be stored in one place or even read once. As a result, a new set of algorithmic tools and ideas is needed for computing with extremely constrained resources. This is the focus of
sublinear algorithms, namely, algorithms whose resource requirements (e.g., time or space) are substantially smaller than the size of the input that they operate on.
We will study various advanced algorithmic ideas through the lens of sublinear algorithms in this course. In particular, we consider the two most canonical models of sublinear algorithms, namely, sublinear time algorithms and streaming algorithms,
and cover several key algorithmic techniques in these (and related) models, as well as discuss limitations inherent to computing with constrained resources.
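To make the notion of "sublinear" concrete, here is a minimal sketch (an illustration only, not taken from the course materials) of a sublinear time algorithm: it estimates the fraction of 1s in a large bit array by reading a random sample whose size depends only on the desired accuracy, not on the input length. The function name `estimate_fraction_of_ones` and the parameters `eps` and `delta` are hypothetical; the sample size comes from Hoeffding's inequality.

```python
import math
import random


def estimate_fraction_of_ones(bits, eps=0.05, delta=0.01, rng=None):
    """Estimate the fraction of 1s in `bits` by sampling.

    With probability at least 1 - delta, the returned estimate is within
    additive error eps of the true fraction. Returns (estimate, samples_read).
    """
    rng = rng or random.Random()
    # Hoeffding's inequality: k >= ln(2/delta) / (2 * eps^2) uniform samples
    # (with replacement) suffice, regardless of len(bits).
    k = math.ceil(math.log(2 / delta) / (2 * eps ** 2))
    hits = sum(bits[rng.randrange(len(bits))] for _ in range(k))
    return hits / k, k
```

With the default parameters the algorithm reads about a thousand entries whether the input has a hundred thousand entries or a billion, which is what makes its running time sublinear in the input size.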
- Instructor: Sepehr Assadi
- Instructor Email: email@example.com
- Lecture Schedule: Tuesdays 3:00 PM - 6:00 PM in TIL-226 (Livingston campus)
- Office hours: Thursdays 3:00 PM to 4:00 PM on Zoom (email the Instructor for the Zoom link)
This course has no recitation sections.
In accordance with Rutgers policy, masks must be worn during class meetings. See the course syllabus
for more details.
The following is a tentative list of topics that will be covered in this course.
- Sublinear Time Algorithms: Which problems can be solved in time faster than even reading the entire input once?
We will cover sublinear time algorithms for property testing, distribution testing, and graph problems. We will also examine query complexity as a main tool for proving lower bounds on
the performance of sublinear time algorithms.
- Streaming Algorithms: Which problems can be solved in space smaller than what is needed to store the entire input? We will cover streaming algorithms for statistical estimation, numerical linear algebra, and graph problems. We will also examine communication complexity as a main tool for proving lower bounds on the performance of streaming algorithms.
Along the way, we will learn about various key ideas such as probabilistic analysis of algorithms, compressed sensing, dimensionality reduction, sparsification, sketching, coresets, etc. that are used extensively in algorithm design as a whole and sublinear algorithms in particular.
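As a small illustration of the streaming model (a sketch only, and not necessarily an algorithm covered in the lectures), the classical Misra-Gries algorithm estimates item frequencies in a single pass while storing at most k - 1 counters, no matter how long the stream is. The function name `misra_gries` and the parameter `k` are chosen here for illustration.

```python
def misra_gries(stream, k):
    """One-pass frequency estimation with at most k - 1 counters (k >= 2).

    For a stream of length m and any item x with true frequency f(x),
    the estimate e(x) = counters.get(x, 0) satisfies
        f(x) - m / k <= e(x) <= f(x).
    """
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # No free counter: decrement every counter and drop the ones
            # that reach zero. The new item itself is not stored.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters
```

Any item that occurs more than m / k times is guaranteed to survive in the returned dictionary, so the sketch finds all sufficiently heavy items using space that depends on k rather than on the length of the stream.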
The final grade for the course will be based on the following weights:
- 40% Problem sets
- 40% Project
- 20% Scribe notes
More details on the grading will be posted soon.
The schedule below the red line is tentative
and subject to change.
|| Date || Topic || References || Lecture notes and Remarks
|| Tue 09/07 || Introduction, Course Policy, Probabilistic Analysis || - || Lecture Notes 1
|| Tue 09/14 || Sublinear Time Algorithms: Connected Components, Average Degree || CRT05, F06, GR08, S15 || Lecture Notes 2
|| Tue 09/21 || Query Complexity: OR Function and Connectivity || - || Lecture Notes 3 -- Pset 1 release: [pdf]
|| Tue 09/28 || Property Testing: Testing Sortedness || - || Lecture Notes 4
|| Tue 10/05 || Distribution Testing: Uniformity Testing || - || Lecture Notes 5
|| Tue 10/12 || Compressed Sensing and Sparse Recovery || BHRRS18, RSW18 || Pset 1 due date
|| Tue 10/19 || Streaming Algorithms: Frequency Moments Estimation || AMS96, BJKST02 || Lecture Notes 7 -- Pset 2 release: [pdf]
|| Tue 10/26 || Communication Complexity: Equality, Index || A96, T16 || Lecture Notes 8
|| Tue 11/02 || Streaming Algorithms: Regression via Dimensionality Reduction || - || -
|| Tue 11/09 || Streaming Algorithms: Clustering via Coresets || GMMMO03, G09 || Pset 2 due date
|| Tue 11/16 || Graph Streaming Algorithms: Connectivity, Shortest Paths, Coloring || FKMSZ04, ACK19 || Pset 3 release date: [pdf]
|| Tue 11/23 || Graph Sketching: AGM Sketch for Connectivity || - || -
|| Tue 11/30 || Multi-Pass Graph Streaming Algorithms || - || -
|| Tue 12/07 || Student Presentations || - || Pset 3 due date
The project can take one of the following forms:
- Solve an open theory problem, formulate a new problem, or make some other contribution to the study of sublinear algorithms and/or lower bounds.
- Write a survey on a few related papers. A good approach is to first try to solve an open problem, which generally requires reading several background papers, and then switch to a survey
if the problem evades solution. However, even in this case, recording any partial progress is very important.
- Implement one or more of the sublinear algorithms studied in the course and compare their performance with standard classical algorithms for the problem on different datasets.
A list of project ideas (including open theory problems and some directions to explore) will be posted sometime in October. However, you are strongly encouraged to approach the Instructor before this date with any project idea you have on sublinear algorithms to pick as your own project -- note that your project does not need to be limited to the topics discussed in class as long as it is (loosely) related to sublinear algorithms.
Timetable for Projects:
- The projects are done in teams of two to three students. Each team needs to submit a single write-up for the project and the presentation time will be split evenly between the team members.
The team members receive the same grade for their project. You are encouraged to discuss your progress on your project with the Instructor and get feedback.
- You should learn a significant amount about your chosen project topic. This will involve closely and carefully reading literature on your specific project topic (likely to be a paper or two).
You will demonstrate this aspect of your project in the "Background" section of your project report, which should be a clear exposition of the topic in your own words. You can format this according to the scribe notes of the class. Basically, for this portion of the project you should turn in a polished and high-quality set of scribe notes, as if you had been the scribe for a lecture on your chosen topic.
- You should gain research experience in this area; i.e. make a serious effort to contribute to the state of knowledge on your project topic by (i) identifying an interesting open question or direction for future research related to your project topic; (ii) coming up with an approach to make progress; and (iii) working to carry out your approach. You will demonstrate this aspect of your project by explaining in detail what you did for (i), (ii) and (iii) in the rest of your project report and during your project presentation.
- Tuesday, November 2: For those of you who are planning to work on the suggested project ideas, send me an email listing your top three favorite projects. For the remainder of you, turn in a brief (1-2 paragraph) project proposal to me by email, describing your chosen topic, the sources you will use, and the portions of those sources that you will cover.
- Wednesday, November 3: For those of you who have sent their favorite projects from the list of suggested project ideas, I will send out an assignment of the project.
- Tuesday, November 30: Each group should send me an email containing a preliminary version of its presentation slides and the topics it would like to cover.
- Tuesday, December 7: 20- to 30-minute presentations in class for each group, in the style of a conference presentation.
- Tuesday, December 14: Email me a five- to ten-page report on your project containing the following:
- A technical summary of the main prior work in the literature, including (1) the formal statement of their result, (2) a high level overview of their proof idea, (3) a lower level description of their proof including the main technical lemmas and claims, and (4) either complete proofs or proof sketches of these technical parts in your own words. This part of the document should be at a level that one could get the full picture of the result you are describing solely based on your write-up.
- A technical summary of your main contributions, including (1) a high level description of a concrete plan for approaching your problem and the type of result you hope to obtain, (2) a lower level set of lemmas and claims that if proven would imply your desired result, (3) either the proofs of the lemmas and claims from the previous part, or concrete and technical reasons for why your attempts failed to prove the desired technical results, together with any possible ideas that you came across along the way that may allow you to bypass these barriers. For more experiment-based projects, steps (2) and (3) should instead be replaced by a description of the exact methodology you used for implementing the algorithms and the experimental results you obtained. This document should be at a level that one (including yourself in the near future!) could pick up the project from this part and continue making progress on the main problem.
There is no official textbook for this course and all required materials will be posted on this webpage.
The following is a list of some helpful supplementary materials (this list is by no means comprehensive):
- Background on Randomized Algorithms:
- Randomized Algorithms by Motwani and Raghavan;
- The Probabilistic Method by Alon and Spencer;
- Concentration of Measure for the Analysis of Randomised Algorithms by Dubhashi and Panconesi.
- Useful Books and Surveys:
- Related Courses:
- Previous Iterations of this Course:
And last but not least, you should definitely check the List of Open Problems in Sublinear Algorithms
as one of the best places to get recent pointers on sublinear algorithms.
This is a (far from comprehensive) list of the papers related to the topics discussed in the lectures. The list will be updated after each lecture to add the new relevant papers.
|| Farid M. Ablayev,
Lower Bounds for One-Way Probabilistic Communication Complexity and Their Application to Space Complexity. Theor. Comput. Sci. 1996, ICALP 1993.
|| Kook Jin Ahn, Sudipto Guha,
Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem. ICALP 2011.
|| Kook Jin Ahn, Sudipto Guha, Andrew McGregor,
Analyzing Graph Structure via Linear Measurements. SODA 2012.
|| Noga Alon, Yossi Matias, and Mario Szegedy,
The space complexity of approximating the frequency moments. STOC 1996.
|| Sanjeev Arora, Elad Hazan, Satyen Kale,
The Multiplicative Weights Update Method: a Meta-Algorithm and Applications. Theory of Computing 2012.
|| Sepehr Assadi, Yu Chen, Sanjeev Khanna,
Sublinear Algorithms for (Δ+1) Vertex Coloring. SODA 2019.
|| Tugkan Batu, Lance Fortnow, Ronitt Rubinfeld, Warren D. Smith, Patrick White,
Testing that distributions are close. FOCS 2000.
|| Ziv Bar-Yossef, T. S. Jayram, Ravi Kumar, D. Sivakumar, Luca Trevisan,
Counting Distinct Elements in a Data Stream.
|| Paul Beame, Sariel Har-Peled, Sivaramakrishnan Natarajan Ramamoorthy, Cyrus Rashtchian, Makrand Sinha,
Edge Estimation with Independent Set Oracles.
|| Harry Buhrman, Ronald de Wolf, Complexity Measures and Decision Tree Complexity: A Survey.
Theor. Comput. Sci., 2002.
|| Kenneth L. Clarkson, David P. Woodruff,
Numerical Linear Algebra in the Streaming Model. STOC 2009.
|| Bernard Chazelle, Ronitt Rubinfeld, Luca Trevisan,
Approximating the Minimum Spanning Tree Weight in Sublinear Time. SIAM Journal on Computing 2005, ICALP 2001.
|| Funda Ergün, Sampath Kannan, Ravi Kumar, Ronitt Rubinfeld, Mahesh Viswanathan,
Spot-Checkers. STOC 1998.
|| Uriel Feige, On Sums of Independent Random Variables with Unbounded Variance and Estimating the Average Degree in a Graph. SIAM Journal on Computing 2006, STOC 2004.
|| Joan Feigenbaum, Sampath Kannan, Andrew McGregor, Siddharth Suri, Jian Zhang, On Graph Problems in a Semi-streaming Model. ICALP 2004.
|| Sudipto Guha, Tight Results for Clustering and Summarizing Data Streams. ICDT 2009.
|| Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan, Clustering Data Streams: Theory and Practice. IEEE Trans. Knowl. Data Eng. 2003, FOCS 2000.
|| Oded Goldreich, Dana Ron, Approximating average parameters of graphs. Random Structures and Algorithms 2008, APPROX-RANDOM 2006.
|| Michal Parnas, Dana Ron, Approximating the Minimum Vertex Cover in Sublinear Time and a Connection
to Distributed Algorithms. Theor. Comput. Sci., 2007.
|| Aviad Rubinstein, Tselil Schramm, S. Matthew Weinberg, Computing Exact Minimum Cuts Without Knowing the Graph. ITCS 2018.
|| Ronitt Rubinfeld, Gil Tamir, Shai Vardi, Ning Xie, Fast Local Computation Algorithms. I(T)CS 2011.
|| C. Seshadhri, A simpler sublinear algorithm for approximating the triangle count. available on arXiv.
|| Tim Roughgarden, Communication Complexity (for Algorithm Designers). Foundations and Trends in Theoretical Computer Science 2016.
You can download LaTeX for free here. For the purpose of this course, you do not even need to install LaTeX and
can instead use an online LaTeX editor such as Overleaf.
Two great introductory resources for LaTeX are A Short Introduction to LaTeX by Allin Cottrell (for general-purpose LaTeX)
and LaTeX for Undergraduates by Jim Hefferon (for undergraduate mathematics), accompanied by the following cheatsheet
(note that this document uses the "\( MATH \)" notation rather than the perhaps more
widely used "$ MATH $" -- both are completely fine in LaTeX).
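For reference, the following minimal document (an illustrative sketch, not part of the course materials) typesets the same inline formula with both notations, followed by a displayed equation. The choice of Markov's inequality as the example formula is arbitrary.

```latex
\documentclass{article}
\usepackage{amsmath, amssymb} % amssymb provides \mathbb

\begin{document}

% Inline math with the \( ... \) delimiters.
Markov's inequality: \( \Pr[X \geq t] \leq \mathbb{E}[X] / t \).

% The same inline formula with the $ ... $ delimiters.
Markov's inequality: $ \Pr[X \geq t] \leq \mathbb{E}[X] / t $.

% Displayed (unnumbered) math uses \[ ... \].
\[
  \Pr[X \geq t] \leq \frac{\mathbb{E}[X]}{t}
\]

\end{document}
```

Both inline forms produce identical output, so you can use whichever you prefer in your problem sets and scribe notes.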
You can also use the wonderful tool Detexify by Daniel Kirsch for finding the
LaTeX command for a symbol (just draw the symbol!).
If you are interested in learning more about LaTeX (beyond what is needed for this course), check the Wikibook on LaTeX
and the Wikibook on LaTeX for Mathematics.
Rutgers Computer Science Department is committed to creating a consciously anti-racist, inclusive community that welcomes diversity in various dimensions
(e.g., race, national origin, gender, sexuality, disability status, class, or religious beliefs).
We will not tolerate micro-aggressions and discrimination that create a hostile atmosphere in the class and/or threaten the well-being of our students.
We will continuously strive to create a safe learning environment that allows for the open exchange of ideas and cherished freedom of speech, while also
ensuring equitable opportunities and respect for all of us. Our goal is to maintain an environment where students, staff, and faculty can contribute without the fear
of ridicule or intolerant or offensive language.
If you witness or experience racism, discrimination, micro-aggressions, or other offensive behavior, you are encouraged to
bring it to the attention of the undergraduate program director and/or the department chair. You can also report it to the
Bias Incident Reporting System.
In order to protect the health and well-being of all members of the University community, masks must be worn by all persons on campus when in the presence of others (within six feet) and in buildings in non-private enclosed settings (e.g., common workspaces, workstations, meeting rooms, classrooms, etc.). Masks must be worn during class meetings; any student not wearing a mask will be asked to leave.
Masks should conform to CDC guidelines
and should completely cover the nose and mouth.
If you are feeling sick, or suspect you may have been exposed to COVID-19, do not come to the class. Arrangements will be made for students who are not able to attend class because of an illness or quarantine.