Compression and Indexing
Massive Data Sets


November 2002




General Information:




Goal: Introduce the principles and the computational problems that arise in designing algorithms and data structures for processing massive data sets. In particular, we will focus on string-matching problems on large text collections, and discuss basic and advanced compression algorithms and indexing data structures for processing, querying, and retrieving such collections.



Exam:   Lecture notes or software project




Book references:


[MG]  Managing Gigabytes. I.H. Witten, A. Moffat, and T.C. Bell. Morgan Kaufmann, 1999.


[S]   Data Compression: The Complete Reference. D. Salomon. Springer, 2nd edition, 2000.


(... articles brought to class by the teacher ...)






List of topics:

1. Introduction

2. Text compression

3. Text indexing and searching