Here you can find a consolidated (that is, slowly updated) list of my publications. A more frequently updated (and possibly noisier) list of works is available on my Google Scholar profile.
Below is a short list of highlighted publications from my recent activity.
Ninniri, Matteo; Podda, Marco; Bacciu, Davide: Classifier-free graph diffusion for molecular property targeting. Workshop: 4th workshop on Graphs and more Complex structures for Learning and Reasoning (GCLR) at AAAI 2024, 2024.
Podda, Marco; Bacciu, Davide; Micheli, Alessio: A Deep Generative Model for Fragment-Based Molecule Generation. Conference: Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), 2020.
Bacciu, Davide; Micheli, Alessio: Deep Learning for Graphs. Book Chapter. In: Oneto, Luca; Navarin, Nicolo; Sperduti, Alessandro; Anguita, Davide (Ed.): Recent Trends in Learning From Data: Tutorials from the INNS Big Data and Deep Learning Conference (INNSBDDL2019), vol. 896, pp. 99-127, Springer International Publishing, 2020, ISBN: 978-3-030-43883-8.

@inproceedings{Ninniri2024,
  title      = {Classifier-free graph diffusion for molecular property targeting},
  author     = {Ninniri, Matteo and Podda, Marco and Bacciu, Davide},
  url        = {https://arxiv.org/abs/2312.17397},
  eprint     = {2312.17397},
  eprinttype = {arXiv},
  year       = {2024},
  date       = {2024-02-27},
  booktitle  = {4th Workshop on Graphs and more Complex Structures for Learning and Reasoning ({GCLR}) at {AAAI} 2024},
  abstract   = {This work focuses on the task of property targeting: that is, generating molecules conditioned on target chemical properties to expedite candidate screening for novel drug and materials development. DiGress is a recent diffusion model for molecular graphs whose distinctive feature is allowing property targeting through classifier-based (CB) guidance. While CB guidance may work to generate molecular-like graphs, we hint at the fact that its assumptions apply poorly to the chemical domain. Based on this insight we propose a classifier-free DiGress (FreeGress), which works by directly injecting the conditioning information into the training process. CF guidance is convenient given its less stringent assumptions and since it does not require to train an auxiliary property regressor, thus halving the number of trainable parameters in the model. We empirically show that our model yields up to 79% improvement in Mean Absolute Error with respect to DiGress on property targeting tasks on QM9 and ZINC-250k benchmarks. As an additional contribution, we propose a simple yet powerful approach to improve chemical validity of generated samples, based on the observation that certain chemical properties such as molecular weight correlate with the number of atoms in molecules.},
  pubstate   = {published},
  tppubtype  = {workshop}
}
@inproceedings{aistats2020,
  title      = {A Deep Generative Model for Fragment-Based Molecule Generation},
  author     = {Podda, Marco and Bacciu, Davide and Micheli, Alessio},
  url        = {https://arxiv.org/abs/2002.12826},
  eprint     = {2002.12826},
  eprinttype = {arXiv},
  year       = {2020},
  date       = {2020-06-03},
  urldate    = {2020-06-03},
  booktitle  = {Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics ({AISTATS} 2020)},
  abstract   = {Molecule generation is a challenging open problem in cheminformatics. Currently, deep generative approaches addressing the challenge belong to two broad categories, differing in how molecules are represented. One approach encodes molecular graphs as strings of text, and learn their corresponding character-based language model. Another, more expressive, approach operates directly on the molecular graph. In this work, we address two limitations of the former: generation of invalid or duplicate molecules. To improve validity rates, we develop a language model for small molecular substructures called fragments, loosely inspired by the well-known paradigm of Fragment-Based Drug Design. In other words, we generate molecules fragment by fragment, instead of atom by atom. To improve uniqueness rates, we present a frequency-based clustering strategy that helps to generate molecules with infrequent fragments. We show experimentally that our model largely outperforms other language model-based competitors, reaching state-of-the-art performances typical of graph-based approaches. Moreover, generated molecules display molecular properties similar to those in the training sample, even in absence of explicit task-specific supervision.},
  pubstate   = {published},
  tppubtype  = {conference}
}
@inbook{graphsBDDL2020,
  title     = {Deep Learning for Graphs},
  author    = {Bacciu, Davide and Micheli, Alessio},
  editor    = {Oneto, Luca and Navarin, Nicol{\`o} and Sperduti, Alessandro and Anguita, Davide},
  url       = {https://link.springer.com/chapter/10.1007/978-3-030-43883-8_5},
  doi       = {10.1007/978-3-030-43883-8_5},
  isbn      = {978-3-030-43883-8},
  year      = {2020},
  date      = {2020-04-04},
  booktitle = {Recent Trends in Learning From Data: Tutorials from the {INNS} Big Data and Deep Learning Conference ({INNSBDDL2019})},
  volume    = {896},
  pages     = {99--127},
  publisher = {Springer International Publishing},
  series    = {Studies in Computational Intelligence},
  abstract  = {We introduce an overview of methods for learning in structured domains covering foundational works developed within the last twenty years to deal with a whole range of complex data representations, including hierarchical structures, graphs and networks, and giving special attention to recent deep learning models for graphs. While we provide a general introduction to the field, we explicitly focus on the neural network paradigm showing how, across the years, these models have been extended to the adaptive processing of incrementally more complex classes of structured data. The ultimate aim is to show how to cope with the fundamental issue of learning adaptive representations for samples with varying size and topology.},
  pubstate  = {published},
  tppubtype = {inbook}
}