Publications

Here you can find a consolidated (a.k.a. slowly updated) list of my publications. A frequently updated (and possibly noisy) list of works is available on my Google Scholar profile.

Below is a short list of highlighted publications from my recent activity.

Serramazza, Davide Italo; Bacciu, Davide
Learning image captioning as a structured transduction task Conference
Proceedings of the 23rd International Conference on Engineering Applications of Neural Networks (EANN 2022), vol. 1600, Communications in Computer and Information Science, Springer, 2022.
@conference{Serramazza2022,
  title = {Learning image captioning as a structured transduction task},
  author = {Davide Italo Serramazza and Davide Bacciu},
  doi = {10.1007/978-3-031-08223-8_20},
  year = {2022},
  date = {2022-06-20},
  urldate = {2022-06-20},
  booktitle = {Proceedings of the 23rd International Conference on Engineering Applications of Neural Networks (EANN 2022)},
  volume = {1600},
  pages = {235--246},
  publisher = {Springer},
  series = {Communications in Computer and Information Science},
  abstract = {Image captioning is a task typically approached by deep encoder-decoder architectures, where the encoder component works on a flat representation of the image while the decoder considers a sequential representation of natural language sentences. As such, these encoder-decoder architectures implement a simple and very specific form of structured transduction, that is a generalization of a predictive problem where the input data and output predictions might have substantially different structures and topologies. In this paper, we explore a generalization of such an approach by addressing the problem as a general structured transduction problem. In particular, we provide a framework that allows considering input and output information with a tree-structured representation. This allows taking into account the hierarchical nature underlying both images and sentences. To this end, we introduce an approach to generate tree-structured representations from images along with an autoencoder working with this kind of data. We empirically assess our approach on both synthetic and realistic tasks.},
  keywords = {},
  pubstate = {published},
  tppubtype = {conference}
}

doi:10.1007/978-3-031-08223-8_20

Davide Bacciu – Homepage

Full Professor – Dipartimento di Informatica, Università di Pisa