Department of Computer Science

University of Pisa

Francesco Tosoni, PhD

Photo by Studio Schloen, Cologne.

Acube Lab


 L.go B. Pontecorvo 3, 56127 Pisa PI, Italy

  Polo Fibonacci, Building C, second floor, room 300

  francesco◦tosoni🐌di◦unipi◦it


I am an algorithmist, primarily specialising in lossless data compression. Since July 2024, I have been working on optimising the compression and efficient indexing of large code archives in collaboration with the Software Heritage team, including Roberto Di Cosmo and Stefano Zacchiroli. In my doctoral thesis, I studied compressed formats for matrices and trie structures; subsequently, I explored various sparse matrix formats that support matrix-vector multiplication (SpMV) in the compressed domain, with a focus on energy efficiency.

For those familiar with the IPA, my name is pronounced: [fraŋ'ʧesko to'zoːni].

Education

I earned a PhD in Computer Science from the University of Pisa, under the supervision of Professors P. Ferragina and G. Manzini. My doctoral dissertation, titled Computation-friendly Compression of Matrices and Tries, focused on efficient data compression techniques. Since 2019, I have been a member of the Acube Laboratory (A³, Advanced Algorithms and Applications), directed by Professor P. Ferragina.

My research interests include lossless data compression, string indexing and stringology, and big data analytics.

I obtained a BSc in Computer and Electronic Engineering from the University of Perugia. I then continued my studies at the University of Pisa, earning an MSc in Computer Science and Networking in 2020, as part of a joint programme with the Sant’Anna School of Advanced Studies. My MSc thesis, Algorithms and Data Structures for Efficient Ride-Sharing Platforms, was awarded the Con.Scienze 2020 Best Thesis Award.

In 2020, I was awarded a scholarship and research grant on "Algorithms and Data Structures for Urban Mobility Platforms" at the University of Pisa. That same year, I obtained the qualification to register as a chartered engineer (Section A, Information Engineering).

From 8 September to 20 December 2022, I was a visiting researcher at Professor Gonzalo Navarro's laboratory at the University of Chile in Santiago.

Publications

2024

  • F. Tosoni, P. Bille, V. Brunacci, A. De Angelis, P. Ferragina, and G. Manzini. Toward Greener Matrix Operations by Lossless Compressed Formats, arXiv, doi: 10.48550/arXiv.2409.18620.

2023

  • A. Boffa, P. Ferragina, F. Tosoni, and G. Vinciguerra. CoCo-trie: Data-aware compression and indexing of strings, Information Systems (IS), doi: 10.1016/j.is.2023.102316.

2022

  • A. Boffa, P. Ferragina, F. Tosoni, and G. Vinciguerra. Compressed String Dictionaries via Data-Aware Subtrie Compaction, 29th International Symposium on String Processing and Information Retrieval (SPIRE 2022), doi: 10.1007/978-3-031-20643-6_17.
  • P. Ferragina, G. Manzini, T. Gagie, D. Köppl, G. Navarro, M. Striani, and F. Tosoni. Improving Matrix-vector Multiplication via Lossless Grammar-Compressed Matrices, Proceedings of the VLDB Endowment (PVLDB), 15(10), 2175-2187, 2022, doi: 10.14778/3547305.3547321.

2021

  • F. Tosoni, P. Ferragina, A. Marino, G. Resta, and P. Santi. Locality Filtering for Efficient Ride Sharing Platforms, IEEE Transactions on Intelligent Transportation Systems (IEEE TITS), doi: 10.1109/TITS.2021.3072830.

Participation in conferences and workshops

2025

  • Software Heritage Kickoff Workshops, 28 January 2025, Paris, France. (web site, slides)

2024

  • From Software Heritage to Code Commons: A Vision for Transparent and Responsible AI in Code-Based Model Training, 12 December 2024, Pilo Boyl Palace, Sant'Anna School of Advanced Studies, Pisa, Italy. (web site)

2022

  • SPIRE ’22, 29th International Symposium on String Processing and Information Retrieval, 8-10 November 2022, Concepción, Chile. (web site)
  • VLDB ’22, 48th International Conference on Very Large Databases, 5-9 September 2022, Sydney, Australia. [virtual] (web site)

Seminars and dissemination

2024

  • Toward Greener Matrix Operations by Lossless Compressed Formats, 7 November 2024, REGINDEX research group, Venice, Italy. (slides)
  • Toward Greener Matrix Operations by Lossless Compressed Formats, 21 October 2024, Efficient Machine Learning Reading Group. [virtual] (web site, YouTube, slides)

2022

  • Improving Matrix-Vector Multiplication via Lossless Grammar-Compressed Matrices, 16 September 2022, Complex Science Hub (CSH), Vienna, Austria. [virtual] (web site)
last update: 29th January '25