The design of learning-based compressed data structures

Name: The design of learning-based compressed data structures
Start: 2021-08-16T14:30:00+02:00
End: 2021-08-16T14:50:00+02:00
Location: Virtual

Abstract

We revisit some fundamental and ubiquitous problems in data structures design, such as predecessor search and rank/select primitives, by exploiting a reduction from the input data to the geometry of points in a Cartesian plane [Ao et al., VLDB 11].
We introduce techniques that discover, or “learn”, the regularities in this Cartesian plane and solve the problems above in an efficient algorithmic way [Ferragina and Vinciguerra, VLDB 20; Boffa et al., ALENEX 21]. Surprisingly, we show that it is possible to both learn the data regularities and retain the space-time complexity bounds of traditional data structures, which translates into robust performance in practice, as we show by experimenting with them on some large datasets.
Motivated by these results, we dig into the core components of these structures and present the first mathematically grounded answer to why learning-based compressed data structures can outperform their traditional counterparts [Ferragina et al., Theor. Comp. Sci. 2021].
We conclude by discussing the plethora of research opportunities that these new approaches to data structures design open up.

Date

16 Aug 2021 14:30 (UTC+02) — 14:50 (UTC+02)

Event

LADSIOS Workshop, co-located with VLDB 2021

Location

Virtual

The design of learning-based compressed data structures

Abstract

Giorgio Vinciguerra

Research Fellow (RTD-A)