Giorgio Vinciguerra

Research Fellow (RTD-A)

Università di Pisa

I have been a Research Fellow (academic rank in Italy: RTD-A) at the Department of Computer Science of the University of Pisa since January 2023.

My research interests include compact data structures, data compression, and algorithm engineering, with a focus on learned data structures: data structures that use machine learning tools to uncover new regularities in the input data and achieve significantly better space-time trade-offs than traditional ones.

I obtained my PhD from the University of Pisa in February 2022 with a thesis titled "Learning-based compressed data structures", which was awarded Best PhD Thesis in Theoretical Computer Science by the Italian Chapter of the EATCS. Before my current position, I was a postdoc (2022) and a PhD student (2018–21) at the University of Pisa. I was a visiting researcher at KTH Royal Institute of Technology (2024) and at Harvard University (2020).

In July 2025, I earned the Italian National Scientific Habilitation as Associate Professor in Computer Science.

Results of my research, including my software libraries, have found applications in database systems, information retrieval systems, and bioinformatics tools. I have also been granted a US patent and an Italian patent, both owned by the University of Pisa.

My research is supported by the EU-funded project SoBigData.it. In the past, I was supported by the SoBigData++ and Multicriteria data structures projects.

Publications

Authors are listed alphabetically for most papers

Md. Hasin Abrar, Paul Medvedev, Giorgio Vinciguerra (2025). Efficiency of learned indexes on genome spectra. ESA.

Antonio Boffa, Roberto Di Cosmo, Paolo Ferragina, Andrea Guerra, Giovanni Manzini, Giorgio Vinciguerra, Stefano Zacchiroli (2025). On the compressibility of large-scale source code datasets. J. Syst. Softw.

Andrea Guerra*, Giorgio Vinciguerra*, Antonio Boffa, Paolo Ferragina (2025). Learned compression of nonlinear time series with random access. ICDE.

Paolo Ferragina, Mariagiovanna Rotundo, Giorgio Vinciguerra (2025). Two-level massive string dictionaries. Inf. Syst.

Marco Costa, Paolo Ferragina, Giorgio Vinciguerra (2024). Grafite: taming adversarial queries with optimal range filters. PACMMOD.

Antonio Boffa, Paolo Ferragina, Francesco Tosoni, Giorgio Vinciguerra (2024). CoCo-trie: data-aware compression and indexing of strings. Inf. Syst.

Paolo Ferragina, Mariagiovanna Rotundo, Giorgio Vinciguerra (2023). Engineering a textbook approach to index massive string dictionaries. SPIRE.

Paolo Ferragina, Marco Frasca, Giosuè Cataldo Marinò, Giorgio Vinciguerra (2023). On nonlinear learned string indexing. IEEE Access.

Paolo Ferragina, Hans-Peter Lehmann, Peter Sanders, Giorgio Vinciguerra (2023). Learned monotone minimal perfect hashing. ESA.

Paolo Ferragina, Giovanni Manzini, Giorgio Vinciguerra (2022). Compressing and querying integer dictionaries under linearities and repetitions. IEEE Access.

Antonio Boffa, Paolo Ferragina, Francesco Tosoni, Giorgio Vinciguerra (2022). Compressed string dictionaries via data-aware subtrie compaction. SPIRE.

Antonio Boffa, Paolo Ferragina, Giorgio Vinciguerra (2022). A learned approach to design compressed rank/select data structures. ACM Trans. Algorithms.

Giorgio Vinciguerra (2022). Learning-based compressed data structures. Ph.D. thesis.

Paolo Ferragina, Giovanni Manzini, Giorgio Vinciguerra (2021). Repetition- and linearity-aware rank/select dictionaries. ISAAC.

Paolo Ferragina, Fabrizio Lillo, Giorgio Vinciguerra (2021). On the performance of learned data structures. Theor. Comput. Sci.

Antonio Boffa, Paolo Ferragina, Giorgio Vinciguerra (2021). A “learned” approach to quicken and compress rank/select dictionaries. ALENEX.

Paolo Ferragina, Fabrizio Lillo, Giorgio Vinciguerra (2020). Why are learned indexes so effective? ICML.

Paolo Ferragina, Giorgio Vinciguerra (2020). The PGM-index: a fully-dynamic compressed learned index with provable worst-case bounds. PVLDB.

Paolo Ferragina, Giorgio Vinciguerra (2020). Learned data structures. Recent Trends in Learning From Data (Springer).

Awards

Best 2022 PhD thesis in Theoretical Computer Science by the Italian Chapter of the EATCS
WSDM 2025 Outstanding Reviewer Award, given to only 10 of 466 PC members

Service

Program committees:
WSDM 2026, CIKM 2025, WSDM 2025, WSDM 2024, AIME 2024, BIBM 2024, WSDM 2023, BIBM 2023

Reproducibility committees:
SIGMOD 2025, SIGMOD 2024

Organising committees:
DSB 2025, SPIRE 2023

Journal reviewer:
IEEE Trans. Knowl. Data Eng., IEEE Trans. Cloud Comput., IEEE Trans. Comput.-Aided Des. Integr. Circuits Syst., Inf. Syst., VLDB J., J. Inf. Secur. Appl., Software: Pract. Exp., Neurocomputing, PLOS ONE

Conference reviewer:
SIGIR 2024, SPIRE 2023, SAND 2023, ALENEX 2022, DCC 2022, LATIN 2022

Talks

Learned compression of nonlinear time series with random access

4 Mar 2025, 14:05–14:35 (UTC+01). Second Workshop on ML4Sys and Sys4ML

Time for learned data structures

11 Sep 2024, 15:00–16:00 (UTC-04). Weekly Wednesday Wartik Genomics Lecture Series

Learned monotone minimal perfect hashing

15 Aug 2023, 14:00–15:00 (UTC+02). BARC Talk

Advances in data-aware compressed-indexing schemes for integer and string keys

7 Mar 2023, 14:40–15:15 (UTC+01). A Tutorial Workshop on ML for Systems and Systems for ML, co-located with BTW 2023

Learning-based approaches to compressed data structures design

22 Aug 2022, 14:30–15:00 (UTC+02). IGAFIT Workshop for Algorithms Postdocs in Europe (AlgPiE by IGAFIT 2022)

Teaching & Supervision

Teacher:

Laboratorio 1 (3/12 ECTS). BSc in Computer Science, University of Pisa. AY 2024/25.
Programmazione e Algoritmica (3/15 ECTS). BSc in Computer Science, University of Pisa. AYs 2024/25, 2023/24.
Information Retrieval (3/6 ECTS). MSc in Computer Science, University of Pisa. AY 2023/24.

Teaching assistant:

Algorithm Engineering. MSc in Computer Science, University of Pisa. AYs 2022/23, 2021/22, 2020/21.
Algoritmica e Laboratorio. BSc in Computer Science, University of Pisa. AY 2018/19.

Supervised and co-supervised students:

MSc in Computer Science, University of Pisa: Marco Costa (2022), Mariagiovanna Rotundo (2022), Antonio Boffa (2020), Lorenzo De Santis (2019).
BSc in Computer Science, University of Pisa: Giuliano Gorgone (2025), Alessio Russo (2020).