A rigorous approach to design learned data structures


A newly proposed paradigm in data structures design consists in integrating machine-learning components that allow faster queries and more succinct (possibly compressed) space usage than traditional approaches thanks to a better adaptation to the regularities present in the input data. Initial research, particularly focusing on data indexing, has been largely empirical and heuristic, and it has been met with enthusiasm by software engineers but with mild interest among theoreticians. In this talk, we discuss some recent results for the ubiquitous problems of data compression and indexing that bring formality to this novel learning-based paradigm. We show the existence of learning-based data structures that achieve, at the same time, improved practical efficiency over known engineered solutions (sometimes of orders of magnitude) and provably good theoretical worst-case guarantees.

10 May 2022 16:30 (UTC+02) — 17:00 (UTC+02)