# Biography

I’m a postdoctoral researcher in Computer Science at the University of Pisa, currently a member of the A³ Lab led by Prof. Paolo Ferragina.

In 2022, I received a Ph.D. in Computer Science from the University of Pisa with a thesis on learning-based compressed data structures: data structures that achieve new space-time trade-offs over traditional solutions by learning, in a rigorous and algorithmically efficient way, the regularities in the input data through tools from machine learning and computational geometry. The thesis was awarded the prize for the Best Ph.D. Thesis in Theoretical Computer Science by the Italian Chapter of the European Association for Theoretical Computer Science (EATCS).
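As a minimal sketch of the core idea, and not code from the thesis itself (the function names and the single-segment model are illustrative assumptions): a sorted array can be indexed by a learned linear model that predicts each key's position within a worst-case error bound ε, so a lookup only scans a window of at most 2ε + 1 slots around the prediction instead of searching the whole array.

```python
import bisect

def fit_line(keys):
    """Least-squares fit of position i as a function of keys[i],
    plus the worst-case prediction error eps (the 'learned' part)."""
    n = len(keys)
    mean_x = sum(keys) / n
    mean_y = (n - 1) / 2
    cov = sum((k - mean_x) * (i - mean_y) for i, k in enumerate(keys))
    var = sum((k - mean_x) ** 2 for k in keys)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    eps = max(abs(i - (slope * k + intercept)) for i, k in enumerate(keys))
    return slope, intercept, int(eps) + 1

def search(keys, model, key):
    """Predict the position, then binary-search only the +/- eps window."""
    slope, intercept, eps = model
    pos = int(slope * key + intercept)
    lo, hi = max(0, pos - eps), min(len(keys), pos + eps + 1)
    i = bisect.bisect_left(keys, key, lo, hi)
    return i if i < len(keys) and keys[i] == key else None

keys = list(range(0, 1000, 3))        # near-linear data: eps stays tiny
model = fit_line(keys)
assert search(keys, model, 42) == 14  # 42 = 3 * 14
assert search(keys, model, 43) is None
```

On data with strong regularities the model replaces most of a traditional index with three numbers (slope, intercept, ε); real designs split the array into many such segments to keep ε small on arbitrary inputs.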

My research is carried out within the national project Multicriteria data structures, funded by the Italian Ministry of University and Research, and within the European H2020 project SoBigData++.

# Interests

• Compact data structures
• Algorithm engineering
• Data compression

# Education

• Ph.D. in Computer Science, 2018–2021, University of Pisa
• M.Sc. in Computer Science, 2016–2018, University of Pisa
• B.Sc. in Computer Science, 2013–2016, University of Pisa

# Publications

(2022). Compressed string dictionaries via data-aware subtrie compaction. SPIRE.

(2022). A learned approach to design compressed rank/select data structures. ACM Trans. Algorithms.

(2022). Learning-based compressed data structures. Ph.D. thesis.

(2021). Repetition- and linearity-aware rank/select dictionaries. ISAAC.

(2021). On the performance of learned data structures. Theor. Comput. Sci.

(2021). A “learned” approach to quicken and compress rank/select dictionaries. ALENEX.

(2020). Why are learned indexes so effective? ICML.

(2020). Learned data structures. Recent Trends in Learning From Data (Springer).

# Projects

#### LZ$\phantom{}_{\boldsymbol\varepsilon}$

Compressed rank/select dictionary based on Lempel-Ziv and LA-vector compression.

#### LZ-End

Implementation of two LZ-End parsing algorithms.

#### PrefixPGM

Proof-of-concept extension of the PGM-index to support fixed-length strings.

#### RearCodedArray

Compressed string dictionary based on rear-coding.
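As a rough sketch of how rear coding compresses a sorted string dictionary (illustrative code, not the project's actual implementation): consecutive strings in sorted order share long prefixes, so each string can be encoded as the number of characters to delete from the end of the previously decoded string plus the new suffix to append.

```python
def rear_encode(strings):
    """Encode sorted strings as (chars to delete from prev, new suffix)."""
    out, prev = [], ""
    for s in strings:
        lcp = 0  # length of the common prefix with the previous string
        while lcp < min(len(prev), len(s)) and prev[lcp] == s[lcp]:
            lcp += 1
        out.append((len(prev) - lcp, s[lcp:]))
        prev = s
    return out

def rear_decode(encoded):
    """Rebuild the strings by trimming and extending the running string."""
    strings, prev = [], ""
    for delete, suffix in encoded:
        prev = prev[:len(prev) - delete] + suffix
        strings.append(prev)
    return strings

words = ["car", "card", "care", "cat"]
assert rear_encode(words) == [(0, "car"), (0, "d"), (1, "e"), (2, "t")]
assert rear_decode(rear_encode(words)) == words
```

Since deletions and suffixes are short on prefix-dense dictionaries, the encoded form is much smaller than storing every string in full.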

#### Block-$\boldsymbol\varepsilon$ tree

Compressed rank/select dictionary exploiting approximate linearity and repetitiveness.

#### LA-vector

Compressed bitvector/container supporting efficient random access and rank queries.

#### PyGM

Python library of sorted containers with state-of-the-art query performance and compressed memory usage.

#### PGM-index

Data structure enabling fast searches in arrays of billions of items using orders of magnitude less space than traditional indexes.

#### CSS-tree

C++11 implementation of the Cache Sensitive Search tree.

#### NN Weaver

Python library to build and train feedforward neural networks, with hyperparameter tuning capabilities.

# Talks

• Learning-based approaches to compressed data structures design
• A rigorous approach to design learned data structures
• The design of learning-based compressed data structures
• A tutorial on learning-based compressed data structures

# Teaching & Supervision

Teaching assistant for:

I co-supervised:

• Antonio Boffa, Spreading the learned approach to succinct data structures, MSc in Computer Science - ICT, 2020.
• Alessio Russo, Learned indexes for the databases of the future (orig. “Learned index per i db del futuro”), BSc in Computer Science, 2020.
• Lorenzo De Santis, On non-linear approaches for piecewise geometric model, MSc in Computer Science - AI, 2019.

Knowledge is like a sphere; the greater its volume, the larger its contact with the unknown.

― Blaise Pascal