Time & Location

Tuesdays 16:00-19:00, Sherman 002


The course introduces key tools and techniques that turn data into information.

  • Streamed and distributed data: Summary structures (sketches) for frequency statistics, heavy hitters, weighted sampling, similarity estimation. Sample-based (MinHash) sketches, linear random projections.
  • Mining and learning from graphs: Centrality (spectral, distance-based), similarity, influence, distance sketches, greedy influence maximization, (personalized) PageRank, semi-supervised learning.
  • Metric data and matrices: Dimensionality reduction: the JL transform, metric embeddings, principal component analysis, nonlinear matrix factorization. Nearest neighbor search and locality-sensitive hashing, clustering.
  • Data privacy


Prerequisites: completion of all first-year courses (linear algebra, calculus, probability), programming, and data structures and algorithms. Open to graduate students and third-year undergraduates.

Office hours:

By appointment { edco, fiat, haimk } AT

Course Workload and Grading

  • 5 Problem sets 30%

  • Final exam 70%

As always, passing the final exam is required for a passing grade in the course.

Discussion Forum

For your initial login, use your TAU user name and password, then register to create an account.
Please post questions and inquiries that are of general interest to students.

Previous offering

Last modified: Sat Oct 21 04:28:15 IDT 2017