Laboratory on Algorithms for Big Data a.a. 2016/17

Teacher: Rossano Venturini

CFU: 6

Period: First semester

Language: English

Lectures schedule: Tuesday 1416 (A1) and Thursday 1618 (A1).

Question time: After lectures or by appointment.

Allowed Degree: Master degree in Computer Science, or Computer Science and Networking, or Business Informatics.
Goals and opportunities
The course introduces advanced algorithms and data structures of practical interest. The students will encouraged in their projects to implement, test and compare these techniques on real datasets.
The course will provide the opportunity of

facing with difficult algorithmic problems of practical interest involving big data;

evaluating the impact of efficient algorithmic solutions in the design of software managing big data;

implementing advanced software by using powerful and sophisticated libraries;

getting in touch with some companies for internships, scholarships, or thesis proposals.
Exam
The exam consists of two parts: Project and its presentation 70% – Written/oral test (or a serious attempt to compete in any online (algorithmic) contest) 30%.
Students can attend the written/oral test before the presentation/development of the project.
Background
If you wish to refresh your mind on basic Algorithms and Data Structures, I suggest you to look at the wellknown book Introduction to Algorithms by CormenLeisersonRivestStein, third edition.
References
 Introduction to Algorithms, 3rd Edition, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein, The MIT Press, 2009 (Amazon) [CCLR]
 Algorithms, 4th Edition, Robert Sedgewick, Kevin Wayne, AddisonWesley Professional, 2011 (Amazon) [RS]
 Algorithms, Sanjoy Dasgupta, Christos Papadimitriou, Umesh Vazirani, McGrawHill, 2006. (Amazon) [DPZ]
 Probability and Computing: Randomized Algorithms and Probabilistic Analysis, Michael Mitzenmacher and Eli Upfal, Cambridge University Press, 2005 (Amazon) [MU]
 Programming Challenges: The Programming Contest Training Manual, Steven S. Skiena, Miguel A. Revilla, Springer, 2003 (Amazon)
 The C++ Standard Library: A Tutorial and Reference (2nd Edition), Nicolai M. Josuttis, AddisonWesley Professional, 2012 (Amazon)
 The C++ Programming Language, 4th Edition, Bjarne Stroustrup, AddisonWesley Professional, 2013 (Amazon)
Interesting links
 Erik D Demaine, Srini Devadas, Nancy Ann Lynch, “MIT Design and Analysis of Algorithms”, MIT Algorithm course which includes video lectures
 Erik D Demaine, “Advanced data structures”, includes video lectures
 Agner Fog, “Optimizing software in C++”
 Links to several libraries and useful resources
Project: Time series indexing
Description and Datasets: small and large.
Lectures
Most of the content of the course will be covered by notes, sometime we’ll use parts of papers/books.
Date  Lecture  References 

20/09/2016  Introduction  Slides 
22/09/2016  Efficiency and profiling  Slides and references therein. Video CppCon 2016. Video CppCon 2015. 
27/09/2016  Dictionary problem: Hashing.  Slides. Hashing Chapter 11 [CCLR]. 
29/09/2016  Dictionary problem: Cuckoo hashing and perfect hashing.  Slides. Perfect Hashing Chapter 11.5 [CCLR]. Cuckoo Hashing. 
04/10/2016  Dictionary problem: Minimal perfect hashing.  Slides. C++ implementation of Minimal perfect hash functions. 
06/10/2016  Succinct data structures: rank/select and EliasFano representation.  Slides. C++ code. 
11/10/2016  Autocompletion system.  Slides. SDSL library. 
13/10/2016  Succinct tree representations: LOUDS, BP, and DFUDS.  Slides. 
18/10/2016  Range Maximum Query and TopK retrieval.  Slides. 
20/10/2016  Discussion about the project.  
08/11/2016  Overview on SIMD vector extensions.  Slides. 
10/11/2016  Problem solving: Sorting I.  Problems: Towers, Little Robber Girl’s Zoo, Finding Team Member, and Megacity. 
15/11/2016  Problem solving: Sorting II.  Problems: Find Pair, Two Heaps, and Inversion Count. Solutions. 
17/11/2016  Problem solving: (Static) Prefix sum.  Problems: Ilya and Queries, Alice, Bob and Chocolate, Number of Ways, and Little Girl and Maximum Sum. Solutions. 
22/11/2016  Problem solving: (Static) Prefix sum II.  
24/11/2016  Problem solving: Prefix sum I.  Fenwick tree: description, tutorial, video, and code. Problem: Update the array. 
29/11/2016  Problem solving: Prefix sum II.  Problems: Pashmak and Parmida’s problem, Nested Segments, and Propagating tree. 
01/12/2016  Problem solving: Prefix sum III.  Solutions 
06/12/2016  Problem solving: Graph algorithms.  [CCLR] Chapter 22. Problems: Fox and Names and Learning Languages. 
07/12/2016  Problem solving: Dynamic Programming I.  [CCLR] Chapter 15. Problems: 01 Knapsack Problem, IWGBS – 0110SS, and Woodcutters. 
13/12/2016  Problem solving: Dynamic Programming II.  Unweighted shortest path and unweighted longest simple path on DAG, longest increasing subsequence and largest independent set on tree. 