Here you can find a consolidated (a.k.a. slowly updated) list of my publications. A frequently updated (and possibly noisy) list of works is available on my Google Scholar profile.
Below is a short list of highlighted publications from my recent activity.
Lepri, Marco; Bacciu, Davide; Della Santina, Cosimo Neural Autoencoder-Based Structure-Preserving Model Order Reduction and Control Design for High-Dimensional Physical Systems Journal Article In: IEEE Control Systems Letters, 2023.
Carta, Antonio; Cossu, Andrea; Lomonaco, Vincenzo; Bacciu, Davide Ex-Model: Continual Learning from a Stream of Trained Models Conference Proceedings of the CVPR 2022 Workshop on Continual Learning, IEEE 2022.
De Caro, Valerio; Bano, Saira; Machumilane, Achilles; Gotta, Alberto; Cassará, Pietro; Carta, Antonio; Sardianos, Christos; Chronis, Christos; Varlamis, Iraklis; Tserpes, Konstantinos; Lomonaco, Vincenzo; Gallicchio, Claudio; Bacciu, Davide AI-as-a-Service Toolkit for Human-Centered Intelligence in Autonomous Driving Conference Proceedings of the 20th International Conference on Pervasive Computing and Communications (PerCom 2022), 2022.
Bacciu, Davide; Lisboa, Paulo J. G.; Vellido, Alfredo Deep Learning in Biology and Medicine Book World Scientific Publisher, 2022, ISBN: 978-1-80061-093-4.
Castellana, Daniele; Bacciu, Davide A Tensor Framework for Learning in Structured Domains Journal Article In: Neurocomputing, vol. 470, pp. 405-426, 2022.
Cossu, Andrea; Carta, Antonio; Lomonaco, Vincenzo; Bacciu, Davide Continual Learning for Recurrent Neural Networks: an Empirical Evaluation Journal Article In: Neural Networks, vol. 143, pp. 607-627, 2021.
Carta, Antonio; Sperduti, Alessandro; Bacciu, Davide Encoding-based Memory for Recurrent Neural Networks Journal Article In: Neurocomputing, vol. 456, pp. 407-420, 2021.
Averta, Giuseppe; Barontini, Federica; Valdambrini, Irene; Cheli, Paolo; Bacciu, Davide; Bianchi, Matteo Learning to Prevent Grasp Failure with Soft Hands: From Online Prediction to Dual-Arm Grasp Recovery Journal Article In: Advanced Intelligent Systems, 2021.
Bacciu, Davide; Bianchi, Filippo Maria; Paassen, Benjamin; Alippi, Cesare Deep learning for graphs Conference Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021), 2021.
Valenti, Andrea; Berti, Stefano; Bacciu, Davide Calliope - A Polyphonic Music Transformer Conference Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021), 2021.
Bacciu, Davide; Conte, Alessio; Grossi, Roberto; Landolfi, Francesco; Marino, Andrea K-Plex Cover Pooling for Graph Neural Networks Journal Article In: Data Mining and Knowledge Discovery, 2021 (also accepted as a paper at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML-PKDD 2021).
Rosasco, Andrea; Carta, Antonio; Cossu, Andrea; Lomonaco, Vincenzo; Bacciu, Davide Distilled Replay: Overcoming Forgetting through Synthetic Samples Workshop IJCAI 2021 Workshop on Continual Semi-Supervised Learning (CSSL 2021), 2021.
Bacciu, Davide; Podda, Marco GraphGen-Redux: a Fast and Lightweight Recurrent Model for Labeled Graph Generation Conference Proceedings of the International Joint Conference on Neural Networks (IJCNN 2021), IEEE 2021.
Lomonaco, Vincenzo; Pellegrini, Lorenzo; Cossu, Andrea; Carta, Antonio; Graffieti, Gabriele; Hayes, Tyler L; De Lange, Matthias; Masana, Marc; Pomponi, Jary; van de Ven, Gido; Mundt, Martin; She, Qi; Cooper, Keiland; Forest, Jeremy; Belouadah, Eden; Calderara, Simone; Parisi, German I; Cuzzolin, Fabio; Tolias, Andreas; Scardapane, Simone; Antiga, Luca; Ahmad, Subutai; Popescu, Adrian; Kanan, Christopher; van de Weijer, Joost; Tuytelaars, Tinne; Bacciu, Davide; Maltoni, Davide Avalanche: an End-to-End Library for Continual Learning Workshop Proceedings of the CVPR 2021 Workshop on Continual Learning, IEEE, 2021.
Ferrari, Elisa; Bacciu, Davide Addressing Fairness, Bias and Class Imbalance in Machine Learning: the FBI-loss Unpublished Online on Arxiv, 2021.
Errica, Federico; Giulini, Marco; Bacciu, Davide; Menichetti, Roberto; Micheli, Alessio; Potestio, Raffaello A deep graph network-enhanced sampling approach to efficiently explore the space of reduced representations of proteins Journal Article In: Frontiers in Molecular Biosciences, vol. 8, pp. 136, 2021.
Valenti, Andrea; Barsotti, Michele; Bacciu, Davide; Ascari, Luca A Deep Classifier for Upper-Limbs Motor Anticipation Tasks in an Online BCI Setting Journal Article In: Bioengineering, 2021.
Crecchi, Francesco; Melis, Marco; Sotgiu, Angelo; Bacciu, Davide; Biggio, Battista FADER: Fast Adversarial Example Rejection Journal Article In: Neurocomputing, 2021, ISSN: 0925-2312.
Ronchetti, Matteo; Bacciu, Davide Generative Tomography Reconstruction Workshop 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Deep Learning and Inverse Problems, 2020.
Bacciu, Davide; Conte, Alessio; Grossi, Roberto; Landolfi, Francesco; Marino, Andrea K-plex Cover Pooling for Graph Neural Networks Workshop 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Learning Meets Combinatorial Algorithms, 2020.
Carta, Antonio; Sperduti, Alessandro; Bacciu, Davide Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization Workshop 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Beyond BackPropagation: Novel Ideas for Training Neural Architectures, 2020.
Valenti, Andrea; Barsotti, Michele; Brondi, Raffaello; Bacciu, Davide; Ascari, Luca ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs Conference Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2020.
Bacciu, Davide; Errica, Federico; Micheli, Alessio; Podda, Marco A Gentle Introduction to Deep Learning for Graphs Journal Article In: Neural Networks, vol. 129, pp. 203-221, 2020.
Bacciu, Davide; Errica, Federico; Micheli, Alessio Probabilistic Learning on Graphs via Contextual Architectures Journal Article In: Journal of Machine Learning Research, vol. 21, no. 134, pp. 1-39, 2020.
Castellana, Daniele; Bacciu, Davide Generalising Recursive Neural Models by Tensor Decomposition Conference Proceedings of the 2020 IEEE World Congress on Computational Intelligence, 2020.
Cossu, Andrea; Carta, Antonio; Bacciu, Davide Continual Learning with Gated Incremental Memories for Sequential Data Processing Conference Proceedings of the 2020 IEEE World Congress on Computational Intelligence, 2020.
Valenti, Andrea; Carta, Antonio; Bacciu, Davide Learning a Latent Space of Style-Aware Music Representations by Adversarial Autoencoders Conference Proceedings of the 24th European Conference on Artificial Intelligence (ECAI 2020), 2020.
Carta, Antonio; Sperduti, Alessandro; Bacciu, Davide Incremental training of a recurrent neural network exploiting a multi-scale dynamic memory Conference Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 (ECML-PKDD 2020), Springer International Publishing, 2020.
Errica, Federico; Podda, Marco; Bacciu, Davide; Micheli, Alessio A Fair Comparison of Graph Neural Networks for Graph Classification Conference Proceedings of the Eighth International Conference on Learning Representations (ICLR 2020), 2020.
Bacciu, Davide; Mandic, Danilo Tensor Decompositions in Deep Learning Conference Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'20), 2020.
Castellana, Daniele; Bacciu, Davide Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data Conference Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'20), 2020.
Bacciu, Davide; Carta, Antonio Sequential Sentence Embeddings for Semantic Similarity Conference Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence (SSCI'19), IEEE, 2019.
Bacciu, Davide; Di Sotto, Luigi A non-negative factorization approach to node pooling in graph convolutional neural networks Conference Proceedings of the 18th International Conference of the Italian Association for Artificial Intelligence (AIIA 2019), Lecture Notes in Artificial Intelligence, Springer-Verlag, 2019.
Bacciu, Davide; Carta, Antonio; Sperduti, Alessandro Linear Memory Networks Conference Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019), vol. 11727, Lecture Notes in Computer Science, Springer-Verlag, 2019.
Della Santina, Cosimo; Averta, Giuseppe; Arapi, Visar; Settimi, Alessandro; Catalano, Manuel Giuseppe; Bacciu, Davide; Bicchi, Antonio; Bianchi, Matteo Autonomous Grasping with SoftHands: Combining Human Inspiration, Deep Learning and Embodied Machine Intelligence Presentation 11.09.2019.
Crecchi, Francesco; Bacciu, Davide; Biggio, Battista Detecting Black-box Adversarial Examples through Nonlinear Dimensionality Reduction Conference Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'19), i6doc.com, Louvain-la-Neuve, Belgium, 2019.
Bacciu, Davide; Biggio, Battista; Crecchi, Francesco; Lisboa, Paulo J. G.; Martin, José D.; Oneto, Luca; Vellido, Alfredo Societal Issues in Machine Learning: When Learning from Data is Not Enough Conference Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'19), i6doc.com, Louvain-la-Neuve, Belgium, 2019.
Bacciu, Davide; Crecchi, Francesco Augmenting Recurrent Neural Networks Resilience by Dropout Journal Article In: IEEE Transactions on Neural Networks and Learning Systems, 2019.
Della Santina, Cosimo; Arapi, Visar; Averta, Giuseppe; Damiani, Francesca; Fiore, Gaia; Settimi, Alessandro; Catalano, Manuel Giuseppe; Bacciu, Davide; Bicchi, Antonio; Bianchi, Matteo Learning from humans how to grasp: a data-driven architecture for autonomous grasping with anthropomorphic soft hands Journal Article In: IEEE Robotics and Automation Letters, pp. 1-8, 2019, ISSN: 2377-3766 (also accepted for presentation at ICRA 2019).
Bacciu, Davide; Bruno, Antonio Deep Tree Transductions - A Short Survey Conference Proceedings of the 2019 INNS Big Data and Deep Learning (INNSBDDL 2019), Recent Advances in Big Data and Deep Learning, Springer International Publishing, 2019.
Arapi, Visar; Della Santina, Cosimo; Bacciu, Davide; Bianchi, Matteo; Bicchi, Antonio DeepDynamicHand: A deep neural architecture for labeling hand manipulation strategies in video sources exploiting temporal information Journal Article In: Frontiers in Neurorobotics, vol. 12, pp. 86, 2018.
Bacciu, Davide; Bruno, Antonio Text Summarization as Tree Transduction by Top-Down TreeLSTM Conference Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI'18), IEEE, 2018.
Bacciu, Davide; Errica, Federico; Micheli, Alessio Contextual Graph Markov Model: A Deep and Generative Approach to Graph Processing Conference Proceedings of the 35th International Conference on Machine Learning (ICML 2018), 2018.
Bacciu, Davide; Bongiorno, Andrea Concentric ESN: Assessing the Effect of Modularity in Cycle Reservoirs Conference Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN 2018), IEEE, 2018.
Bacciu, Davide; Lisboa, Paulo J. G.; Martin, José D.; Stoean, Ruxandra; Vellido, Alfredo Bioinformatics and medicine in the era of deep learning Conference Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'18), i6doc.com, Louvain-la-Neuve, Belgium, 2018, ISBN: 978-287587047-6.
Bacciu, Davide Hidden Tree Markov Networks: Deep and Wide Learning for Structured Data Conference Proceedings of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI'17), IEEE, 2017.
Bacciu, Davide; Crecchi, Francesco; Morelli, Davide DropIn: Making Neural Networks Robust to Missing Inputs by Dropout Conference Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN 2017), IEEE, 2017, ISBN: 978-1-5090-6182-2.
Bacciu, Davide; Gervasi, Vincenzo; Prencipe, Giuseppe An Investigation into Cybernetic Humor, or: Can Machines Laugh? Conference Proceedings of the 8th International Conference on Fun with Algorithms (FUN'16), vol. 49, Leibniz International Proceedings in Informatics (LIPIcs), Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2016, ISSN: 1868-8969.
@article{lepri2023neural,
title = {Neural Autoencoder-Based Structure-Preserving Model Order Reduction and Control Design for High-Dimensional Physical Systems},
author = {Marco Lepri and Davide Bacciu and Cosimo Della Santina},
year = {2023},
date = {2023-12-21},
urldate = {2023-01-01},
journal = {IEEE Control Systems Letters},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@conference{carta2021ex,
title = {Ex-Model: Continual Learning from a Stream of Trained Models},
author = {Antonio Carta and Andrea Cossu and Vincenzo Lomonaco and Davide Bacciu},
url = {https://arxiv.org/pdf/2112.06511.pdf, Arxiv},
year = {2022},
date = {2022-06-20},
urldate = {2022-06-20},
booktitle = {Proceedings of the CVPR 2022 Workshop on Continual Learning },
journal = {arXiv preprint arXiv:2112.06511},
pages = {3790-3799},
organization = {IEEE},
abstract = {Learning continually from non-stationary data streams is a challenging research topic of growing popularity in the last few years. Being able to learn, adapt, and generalize continually in an efficient, effective, and scalable way is fundamental for a sustainable development of Artificial Intelligence systems. However, an agent-centric view of continual learning requires learning directly from raw data, which limits the interaction between independent agents, the efficiency, and the privacy of current approaches. Instead, we argue that continual learning systems should exploit the availability of compressed information in the form of trained models. In this paper, we introduce and formalize a new paradigm named "Ex-Model Continual Learning" (ExML), where an agent learns from a sequence of previously trained models instead of raw data. We further contribute with three ex-model continual learning algorithms and an empirical setting comprising three datasets (MNIST, CIFAR-10 and CORe50) and eight scenarios, where the proposed algorithms are extensively tested. Finally, we highlight the peculiarities of the ex-model paradigm and we point out interesting future research directions.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{decaro2022aiasaservice,
title = {AI-as-a-Service Toolkit for Human-Centered Intelligence in Autonomous Driving},
author = {Valerio De Caro and Saira Bano and Achilles Machumilane and Alberto Gotta and Pietro Cassará and Antonio Carta and Christos Sardianos and Christos Chronis and Iraklis Varlamis and Konstantinos Tserpes and Vincenzo Lomonaco and Claudio Gallicchio and Davide Bacciu},
url = {https://arxiv.org/pdf/2202.01645.pdf, arxiv},
year = {2022},
date = {2022-03-21},
urldate = {2022-03-21},
booktitle = {Proceedings of the 20th International Conference on Pervasive Computing and Communications (PerCom 2022)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@book{BacciuBook2022,
title = {Deep Learning in Biology and Medicine},
author = {Davide Bacciu and Paulo J. G. Lisboa and Alfredo Vellido},
doi = {10.1142/q0322},
isbn = {978-1-80061-093-4},
year = {2022},
date = {2022-02-01},
urldate = {2022-02-01},
publisher = {World Scientific Publisher},
abstract = {Biology, medicine and biochemistry have become data-centric fields for which Deep Learning methods are delivering groundbreaking results. Addressing high impact challenges, Deep Learning in Biology and Medicine provides an accessible and organic collection of Deep Learning essays on bioinformatics and medicine. It caters for a wide readership, ranging from machine learning practitioners and data scientists seeking methodological knowledge to address biomedical applications, to life science specialists in search of a gentle reference for advanced data analytics.
With contributions from internationally renowned experts, the book covers foundational methodologies in a wide spectrum of life sciences applications, including electronic health record processing, diagnostic imaging, text processing, as well as omics-data processing. This survey of consolidated problems is complemented by a selection of advanced applications, including cheminformatics and biomedical interaction network analysis. A modern and mindful approach to the use of data-driven methodologies in the life sciences also requires careful consideration of the associated societal, ethical, legal and transparency challenges, which are covered in the concluding chapters of this book.},
keywords = {},
pubstate = {published},
tppubtype = {book}
}
@article{Castellana2021,
title = {A Tensor Framework for Learning in Structured Domains},
author = {Daniele Castellana and Davide Bacciu},
editor = {Kerstin Bunte and Niccolo Navarin and Luca Oneto},
doi = {10.1016/j.neucom.2021.05.110},
year = {2022},
date = {2022-01-22},
urldate = {2022-01-22},
journal = {Neurocomputing},
volume = {470},
pages = {405-426},
abstract = {Learning machines for structured data (e.g., trees) are intrinsically based on their capacity to learn representations by aggregating information from the multi-way relationships emerging from the structure topology. While complex aggregation functions are desirable in this context to increase the expressiveness of the learned representations, the modelling of higher-order interactions among structure constituents is unfeasible, in practice, due to the exponential number of parameters required. Therefore, the common approach is to define models which rely only on first-order interactions among structure constituents.
In this work, we leverage tensors theory to define a framework for learning in structured domains. Such a framework is built on the observation that more expressive models require a tensor parameterisation. This observation is the stepping stone for the application of tensor decompositions in the context of recursive models. From this point of view, the advantage of using tensor decompositions is twofold since it allows limiting the number of model parameters while injecting inductive biases that do not ignore higher-order interactions.
We apply the proposed framework on probabilistic and neural models for structured data, defining different models which leverage tensor decompositions. The experimental validation clearly shows the advantage of these models compared to first-order and full-tensorial models.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@article{Cossu2021b,
title = {Continual Learning for Recurrent Neural Networks: an Empirical Evaluation},
author = {Andrea Cossu and Antonio Carta and Vincenzo Lomonaco and Davide Bacciu},
url = {https://arxiv.org/abs/2103.07492, Arxiv},
year = {2021},
date = {2021-12-03},
urldate = {2021-12-03},
journal = {Neural Networks},
volume = {143},
pages = {607-627},
abstract = { Learning continuously during all model lifetime is fundamental to deploy machine learning solutions robust to drifts in the data distribution. Advances in Continual Learning (CL) with recurrent neural networks could pave the way to a large number of applications where incoming data is non stationary, like natural language processing and robotics. However, the existing body of work on the topic is still fragmented, with approaches which are application-specific and whose assessment is based on heterogeneous learning protocols and datasets. In this paper, we organize the literature on CL for sequential data processing by providing a categorization of the contributions and a review of the benchmarks. We propose two new benchmarks for CL with sequential data based on existing datasets, whose characteristics resemble real-world applications. We also provide a broad empirical evaluation of CL and Recurrent Neural Networks in class-incremental scenario, by testing their ability to mitigate forgetting with a number of different strategies which are not specific to sequential data processing. Our results highlight the key role played by the sequence length and the importance of a clear specification of the CL scenario. },
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@article{Carta2021b,
title = {Encoding-based Memory for Recurrent Neural Networks},
author = {Antonio Carta and Alessandro Sperduti and Davide Bacciu},
url = {https://arxiv.org/abs/2001.11771, Arxiv},
doi = {10.1016/j.neucom.2021.04.051},
year = {2021},
date = {2021-10-07},
urldate = {2021-10-07},
journal = {Neurocomputing},
volume = {456},
pages = {407-420},
publisher = {Elsevier},
abstract = {Learning to solve sequential tasks with recurrent models requires the ability to memorize long sequences and to extract task-relevant features from them. In this paper, we study the memorization subtask from the point of view of the design and training of recurrent neural networks. We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with a linear autoencoder for sequences. We extend the memorization component with a modular memory that encodes the hidden state sequence at different sampling frequencies. Additionally, we provide a specialized training algorithm that initializes the memory to efficiently encode the hidden activations of the network. The experimental results on synthetic and real-world datasets show that specializing the training algorithm to train the memorization component always improves the final performance whenever the memorization of long sequences is necessary to solve the problem. },
keywords = {},
pubstate = {published},
tppubtype = {article}
}
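For context on the memorization component mentioned in the abstract: a linear autoencoder for sequences is conventionally written as the linear recurrence below. This is a sketch of the standard formulation from the linear autoencoder literature, not notation quoted from the paper; A, B denote generic encoder matrices and C, D decoder matrices.

\[
m_t = A x_t + B m_{t-1}, \qquad \begin{bmatrix} x_t \\ m_{t-1} \end{bmatrix} \approx \begin{bmatrix} C \\ D \end{bmatrix} m_t ,
\]

where m_t is a memory state encoding the input prefix x_1, ..., x_t; in this family of models, optimal encoder weights can be obtained in closed form from an SVD of the data matrix of input sequences rather than by backpropagation.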
@article{Averta2021,
title = {Learning to Prevent Grasp Failure with Soft Hands: From Online Prediction to Dual-Arm Grasp Recovery},
author = {Giuseppe Averta and Federica Barontini and Irene Valdambrini and Paolo Cheli and Davide Bacciu and Matteo Bianchi},
doi = {10.1002/aisy.202100146},
year = {2021},
date = {2021-10-07},
urldate = {2021-10-07},
journal = {Advanced Intelligent Systems},
abstract = {Soft hands allow for simpler grasp planning to achieve a successful grasp, thanks to their intrinsic adaptability. At the same time, their usage poses new challenges, related to the adoption of classical sensing techniques originally developed for rigid end effectors, which provide fundamental information, such as to detect object slippage. In this regard, model-based approaches for the processing of the gathered information are hard to use, due to the difficulties in modeling hand–object interaction when softness is involved. To overcome these limitations, in this article, we propose to combine distributed tactile sensing and machine learning (recurrent neural network) to detect sliding conditions for a soft robotic hand mounted on a robotic manipulator, targeting the prediction of the grasp failure event and the direction of sliding. The outcomes of these predictions allow for an online triggering of a compensatory action performed with a second robotic arm–hand system, to prevent the failure. Despite the fact that the network is trained only with spherical and cylindrical objects, we demonstrate high generalization capabilities of our framework, achieving a correct prediction of the failure direction in 75% of cases, and an 85% rate of successful regrasps, for a selection of 12 objects of common use.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@conference{Bacciu2021c,
title = { Deep learning for graphs},
author = {Davide Bacciu and Filippo Maria Bianchi and Benjamin Paassen and Cesare Alippi},
editor = {Michel Verleysen},
doi = {10.14428/esann/2021.ES2021-5},
year = {2021},
date = {2021-10-06},
urldate = {2021-10-06},
booktitle = {Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021)},
pages = {89-98},
abstract = { Deep learning for graphs encompasses all those models endowed with multiple layers of abstraction, which operate on data represented as graphs. The most common building blocks of these models are graph encoding layers, which compute a vector embedding for each node in a graph based on a sum of messages received from its neighbors. However, the family also includes architectures with decoders from vectors to graphs and models that process time-varying graphs and hypergraphs. In this paper, we provide an overview of the key concepts in the field, point towards open questions, and frame the contributions of the ESANN 2021 special session into the broader context of deep learning for graphs. },
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
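The abstract's description of a graph encoding layer (a node embedding computed from a sum of messages received from neighbors) corresponds to the generic neighborhood-aggregation scheme below; this is the standard form from the literature, with \phi and \psi denoting learnable update and message functions, not symbols taken from the paper.

\[
h_v^{(\ell+1)} = \phi\Big( h_v^{(\ell)}, \sum_{u \in \mathcal{N}(v)} \psi\big(h_u^{(\ell)}\big) \Big),
\]

where \mathcal{N}(v) is the neighborhood of node v and h_v^{(\ell)} its embedding at layer \ell.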
@conference{Valenti2021b,
title = {Calliope - A Polyphonic Music Transformer},
author = {Andrea Valenti and Stefano Berti and Davide Bacciu},
editor = {Michel Verleysen},
abstract = {The polyphonic nature of music makes the application of deep learning to music modelling a challenging task. On the other hand, the Transformer architecture seems to be a good fit for this kind of data. In this work, we present Calliope, a novel autoencoder model based on Transformers for the efficient modelling of multi-track sequences of polyphonic music. The experiments show that our model is able to improve the state of the art on musical sequence reconstruction and generation, with remarkably good results especially on long sequences.},
doi = {10.14428/esann/2021.ES2021-63},
year = {2021},
date = {2021-10-06},
urldate = {2021-10-06},
booktitle = {Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021)},
pages = {405-410},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{Bacciu2021b,
title = {K-Plex Cover Pooling for Graph Neural Networks},
author = {Davide Bacciu and Alessio Conte and Roberto Grossi and Francesco Landolfi and Andrea Marino},
editor = {Annalisa Appice and Sergio Escalera and José A. Gámez and Heike Trautmann},
url = {https://link.springer.com/article/10.1007/s10618-021-00779-z, Published version},
doi = {10.1007/s10618-021-00779-z},
year = {2021},
date = {2021-09-13},
urldate = {2021-09-13},
journal = {Data Mining and Knowledge Discovery},
abstract = {Graph pooling methods provide mechanisms for structure reduction that are intended to ease the diffusion of context between nodes further in the graph, and that typically leverage community discovery mechanisms or node and edge pruning heuristics. In this paper, we introduce a novel pooling technique which borrows from classical results in graph theory that is non-parametric and generalizes well to graphs of different nature and connectivity patterns. Our pooling method, named KPlexPool, builds on the concepts of graph covers and k-plexes, i.e. pseudo-cliques where each node can miss up to k links. The experimental evaluation on benchmarks on molecular and social graph classification shows that KPlexPool achieves state of the art performances against both parametric and non-parametric pooling methods in the literature, despite generating pooled graphs based solely on topological information.},
note = {Accepted also as paper to the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD 2021)},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
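For readers outside graph theory: the k-plex underlying KPlexPool has a classical formal definition (standard in the literature, paraphrased rather than quoted from the paper):

\[
S \subseteq V \text{ is a } k\text{-plex} \iff \deg_S(v) \ge |S| - k \quad \text{for every } v \in S,
\]

i.e. each node of S may be non-adjacent to at most k nodes of S, itself included; k = 1 recovers the clique.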
@workshop{Rosasco2021,
title = {Distilled Replay: Overcoming Forgetting through Synthetic Samples},
author = {Andrea Rosasco and Antonio Carta and Andrea Cossu and Vincenzo Lomonaco and Davide Bacciu},
url = {https://arxiv.org/abs/2103.15851, Arxiv},
year = {2021},
date = {2021-08-19},
urldate = {2021-08-19},
booktitle = {IJCAI 2021 workshop on continual semi-supervised learning (CSSL 2021) },
abstract = {Replay strategies are Continual Learning techniques which mitigate catastrophic forgetting by keeping a buffer of patterns from previous experience, which are interleaved with new data during training. The amount of patterns stored in the buffer is a critical parameter which largely influences the final performance and the memory footprint of the approach. This work introduces Distilled Replay, a novel replay strategy for Continual Learning which is able to mitigate forgetting by keeping a very small buffer (up to one pattern per class) of highly informative samples. Distilled Replay builds the buffer through a distillation process which compresses a large dataset into a tiny set of informative examples. We show the effectiveness of our Distilled Replay against naive replay, which randomly samples patterns from the dataset, on four popular Continual Learning benchmarks.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
@conference{BacciuPoddaIJCNN2021,
title = {GraphGen-Redux: a Fast and Lightweight Recurrent Model for Labeled Graph Generation},
author = {Davide Bacciu and Marco Podda},
doi = {10.1109/IJCNN52387.2021.9533743},
year = {2021},
date = {2021-07-18},
urldate = {2021-07-18},
booktitle = {Proceedings of the International Joint Conference on Neural Networks (IJCNN 2021)},
organization = {IEEE},
abstract = {The problem of labeled graph generation is gaining attention in the Deep Learning community. The task is challenging due to the sparse and discrete nature of graph spaces. Several approaches have been proposed in the literature, most of which require to transform the graphs into sequences that encode their structure and labels and to learn the distribution of such sequences through an auto-regressive generative model. Among this family of approaches, we focus on the Graphgen model. The preprocessing phase of Graphgen transforms graphs into unique edge sequences called Depth-First Search (DFS) codes, such that two isomorphic graphs are assigned the same DFS code. Each element of a DFS code is associated with a graph edge: specifically, it is a quintuple comprising one node identifier for each of the two endpoints, their node labels, and the edge label. Graphgen learns to generate such sequences auto-regressively and models the probability of each component of the quintuple independently. While effective, the independence assumption made by the model is too loose to capture the complex label dependencies of real-world graphs precisely. By introducing a novel graph preprocessing approach, we are able to process the labeling information of both nodes and edges jointly. The corresponding model, which we term Graphgen-redux, improves upon the generative performances of Graphgen in a wide range of datasets of chemical and social graphs. In addition, it uses approximately 78% fewer parameters than the vanilla variant and requires 50% fewer epochs of training on average.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
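A compact way to see the modelling difference described in the abstract (notation ours, not the paper's): each DFS-code entry is a quintuple e = (t_u, t_v, L_u, L_{(u,v)}, L_v) of the two endpoint identifiers, their node labels, and the edge label. Vanilla Graphgen assumes independence,

\[
p(e) = p(t_u)\, p(t_v)\, p(L_u)\, p(L_{(u,v)})\, p(L_v),
\]

whereas, per the abstract, Graphgen-Redux processes the labeling information of nodes and edges jointly, which amounts to replacing the three label factors with a joint term p(L_u, L_{(u,v)}, L_v).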
@workshop{lomonaco2021avalanche,
title = {Avalanche: an End-to-End Library for Continual Learning},
author = {Vincenzo Lomonaco and Lorenzo Pellegrini and Andrea Cossu and Antonio Carta and Gabriele Graffieti and Tyler L Hayes and Matthias De Lange and Marc Masana and Jary Pomponi and Gido van de Ven and Martin Mundt and Qi She and Keiland Cooper and Jeremy Forest and Eden Belouadah and Simone Calderara and German I Parisi and Fabio Cuzzolin and Andreas Tolias and Simone Scardapane and Luca Antiga and Subutai Ahmad and Adrian Popescu and Christopher Kanan and Joost van de Weijer and Tinne Tuytelaars and Davide Bacciu and Davide Maltoni},
url = {https://arxiv.org/abs/2104.00405, Arxiv},
year = {2021},
date = {2021-06-19},
urldate = {2021-06-19},
booktitle = {Proceedings of the CVPR 2021 Workshop on Continual Learning },
pages = {3600-3610},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
@unpublished{Ferrari2021,
title = {Addressing Fairness, Bias and Class Imbalance in Machine Learning: the FBI-loss},
author = {Elisa Ferrari and Davide Bacciu},
url = {https://arxiv.org/abs/2105.06345, Arxiv},
year = {2021},
date = {2021-05-13},
urldate = {2021-05-13},
abstract = {Resilience to class imbalance and confounding biases, together with the assurance of fairness guarantees are highly desirable properties of autonomous decision-making systems with real-life impact. Many different targeted solutions have been proposed to address separately these three problems, however a unifying perspective seems to be missing. With this work, we provide a general formalization, showing that they are different expressions of unbalance. Following this intuition, we formulate a unified loss correction to address issues related to Fairness, Biases and Imbalances (FBI-loss). The correction capabilities of the proposed approach are assessed on three real-world benchmarks, each associated to one of the issues under consideration, and on a family of synthetic data in order to better investigate the effectiveness of our loss on tasks with different complexities. The empirical results highlight that the flexible formulation of the FBI-loss leads also to competitive performances with respect to literature solutions specialised for the single problems.},
howpublished = {Online on Arxiv},
keywords = {},
pubstate = {published},
tppubtype = {unpublished}
}
@article{errica_deep_2021,
title = {A deep graph network-enhanced sampling approach to efficiently explore the space of reduced representations of proteins},
author = {Federico Errica and Marco Giulini and Davide Bacciu and Roberto Menichetti and Alessio Micheli and Raffaello Potestio},
doi = {10.3389/fmolb.2021.637396},
year = {2021},
date = {2021-02-28},
urldate = {2021-02-28},
journal = {Frontiers in Molecular Biosciences},
volume = {8},
pages = {136},
publisher = {Frontiers},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@article{Valenti2021,
title = {A Deep Classifier for Upper-Limbs Motor Anticipation Tasks in an Online BCI Setting},
author = {Andrea Valenti and Michele Barsotti and Davide Bacciu and Luca Ascari},
url = {https://www.mdpi.com/2306-5354/8/2/21, Open Access },
doi = {10.3390/bioengineering8020021},
year = {2021},
date = {2021-02-05},
urldate = {2021-02-05},
journal = {Bioengineering },
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@article{CRECCHI2021,
title = {FADER: Fast Adversarial Example Rejection},
author = {Francesco Crecchi and Marco Melis and Angelo Sotgiu and Davide Bacciu and Battista Biggio},
url = {https://arxiv.org/abs/2010.09119, Arxiv},
doi = {10.1016/j.neucom.2021.10.082},
issn = {0925-2312},
year = {2021},
date = {2021-01-01},
urldate = {2021-01-01},
journal = {Neurocomputing},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@workshop{tomographyNeurips2020,
title = {Generative Tomography Reconstruction},
author = {Matteo Ronchetti and Davide Bacciu},
url = {https://arxiv.org/pdf/2010.14933.pdf, PDF},
year = {2020},
date = {2020-12-11},
urldate = {2020-12-11},
booktitle = {34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Deep Learning and Inverse Problems},
abstract = {We propose an end-to-end differentiable architecture for tomography reconstruction that directly maps a noisy sinogram into a denoised reconstruction. Compared to existing approaches our end-to-end architecture produces more accurate reconstructions while using fewer parameters and less time. We also propose a generative model that, given a noisy sinogram, can sample realistic reconstructions. This generative model can be used as a prior inside an iterative process that, by taking into consideration the physical model, can reduce artifacts and errors in the reconstructions.},
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
@workshop{kplexWS2020,
title = {K-plex Cover Pooling for Graph Neural Networks},
author = {Davide Bacciu and Alessio Conte and Roberto Grossi and Francesco Landolfi and Andrea Marino},
year = {2020},
date = {2020-12-11},
urldate = {2020-12-11},
booktitle = {34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Learning Meets Combinatorial Algorithms},
abstract = {We introduce a novel pooling technique which borrows from classical results in graph theory that is non-parametric and generalizes well to graphs of different nature and connectivity pattern. Our pooling method, named KPlexPool, builds on the concepts of graph covers and $k$-plexes, i.e. pseudo-cliques where each node can miss up to $k$ links. The experimental evaluation on molecular and social graph classification shows that KPlexPool achieves state of the art performances, supporting the intuition that well-founded graph-theoretic approaches can be effectively integrated in learning models for graphs. },
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
@workshop{CartaNeuripsWS2020,
title = { Short-Term Memory Optimization in Recurrent Neural Networks by Autoencoder-based Initialization },
author = {Antonio Carta and Alessandro Sperduti and Davide Bacciu},
url = {https://arxiv.org/abs/2011.02886, Arxiv},
year = {2020},
date = {2020-12-11},
urldate = {2020-12-11},
booktitle = {34th Conference on Neural Information Processing Systems (NeurIPS 2020), Workshop on Beyond BackPropagation: Novel Ideas for Training Neural Architectures},
abstract = {Training RNNs to learn long-term dependencies is difficult due to vanishing gradients. We explore an alternative solution based on explicit memorization using linear autoencoders for sequences, which allows to maximize the short-term memory and that can be solved with a closed-form solution without backpropagation. We introduce an initialization schema that pretrains the weights of a recurrent neural network to approximate the linear autoencoder of the input sequences and we show how such pretraining can better support solving hard classification tasks with long sequences. We test our approach on sequential and permuted MNIST. We show that the proposed approach achieves a much lower reconstruction error for long sequences and a better gradient propagation during the finetuning phase. },
keywords = {},
pubstate = {published},
tppubtype = {workshop}
}
@conference{smc2020,
title = {ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs},
author = {Andrea Valenti and Michele Barsotti and Raffaello Brondi and Davide Bacciu and Luca Ascari},
url = {https://arxiv.org/abs/2008.13485, Arxiv},
year = {2020},
date = {2020-10-11},
urldate = {2020-10-11},
booktitle = {Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC)},
publisher = {IEEE},
abstract = {Typical EEG-based BCI applications require the computation of complex functions over the noisy EEG channels to be carried out in an efficient way. Deep learning algorithms are capable of learning flexible nonlinear functions directly from data, and their constant processing latency is perfect for their deployment into online BCI systems. However, it is crucial for the jitter of the processing system to be as low as possible, in order to avoid unpredictable behaviour that can ruin the system's overall usability. In this paper, we present a novel encoding method, based on deep convolutional autoencoders, that is able to perform efficient compression of the raw EEG inputs. We deploy our model in a ROS-Neuro node, thus making it suitable for integration into ROS-based BCI and robotic systems in real world scenarios. The experimental results show that our system is capable of generating meaningful compressed encodings that preserve the original information contained in the raw input. They also show that the ROS-Neuro node is able to produce such encodings at a steady rate, with minimal jitter. We believe that our system can represent an important step towards the development of an effective BCI processing pipeline fully standardized in the ROS-Neuro framework.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{gentleGraphs2020,
title = {A Gentle Introduction to Deep Learning for Graphs},
author = {Davide Bacciu and Federico Errica and Alessio Micheli and Marco Podda},
url = {https://arxiv.org/abs/1912.12693, Arxiv
https://doi.org/10.1016/j.neunet.2020.06.006, Original Paper},
doi = {10.1016/j.neunet.2020.06.006},
year = {2020},
date = {2020-09-01},
urldate = {2020-09-01},
journal = {Neural Networks},
volume = {129},
pages = {203-221},
publisher = {Elsevier},
abstract = {The adaptive processing of graph data is a long-standing research topic which has been lately consolidated as a theme of major interest in the deep learning community. The snap increase in the amount and breadth of related research has come at the price of little systematization of knowledge and attention to earlier literature. This work is designed as a tutorial introduction to the field of deep learning for graphs. It favours a consistent and progressive introduction of the main concepts and architectural aspects over an exposition of the most recent literature, for which the reader is referred to available surveys. The paper takes a top-down view to the problem, introducing a generalized formulation of graph representation learning based on a local and iterative approach to structured information processing. It introduces the basic building blocks that can be combined to design novel and effective neural models for graphs. The methodological exposition is complemented by a discussion of interesting research challenges and applications in the field. },
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@article{jmlrCGMM20,
title = {Probabilistic Learning on Graphs via Contextual Architectures},
author = {Davide Bacciu and Federico Errica and Alessio Micheli},
editor = {Pushmeet Kohli},
url = {http://jmlr.org/papers/v21/19-470.html, Paper},
year = {2020},
date = {2020-07-27},
urldate = {2020-07-27},
journal = {Journal of Machine Learning Research},
volume = {21},
number = {134},
pages = {1-39},
abstract = {We propose a novel methodology for representation learning on graph-structured data, in which a stack of Bayesian Networks learns different distributions of a vertex's neighborhood. Through an incremental construction policy and layer-wise training, we can build deeper architectures with respect to typical graph convolutional neural networks, with benefits in terms of context spreading between vertices.
First, the model learns from graphs via maximum likelihood estimation without using target labels.
Then, a supervised readout is applied to the learned graph embeddings to deal with graph classification and vertex classification tasks, showing competitive results against neural models for graphs. The computational complexity is linear in the number of edges, facilitating learning on large scale data sets. By studying how depth affects the performances of our model, we discover that a broader context generally improves performances. In turn, this leads to a critical analysis of some benchmarks used in literature.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@conference{Wcci20Tensor,
title = {Generalising Recursive Neural Models by Tensor Decomposition},
author = {Daniele Castellana and Davide Bacciu},
url = {https://arxiv.org/abs/2006.10021, Arxiv},
year = {2020},
date = {2020-07-19},
urldate = {2020-07-19},
booktitle = {Proceedings of the 2020 IEEE World Congress on Computational Intelligence},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{Wcci20CL,
title = {Continual Learning with Gated Incremental Memories for Sequential Data Processing},
author = {Andrea Cossu and Antonio Carta and Davide Bacciu},
url = {https://arxiv.org/pdf/2004.04077.pdf, Arxiv},
doi = {10.1109/IJCNN48605.2020.9207550},
year = {2020},
date = {2020-07-19},
urldate = {2020-07-19},
booktitle = {Proceedings of the 2020 IEEE World Congress on Computational Intelligence},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{ecai2020,
title = { Learning a Latent Space of Style-Aware Music Representations by Adversarial Autoencoders},
author = {Andrea Valenti and Antonio Carta and Davide Bacciu},
url = {https://arxiv.org/abs/2001.05494},
year = {2020},
date = {2020-06-08},
booktitle = {Proceedings of the 24th European Conference on Artificial Intelligence (ECAI 2020)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{ecml2020LMN,
title = {Incremental training of a recurrent neural network exploiting a multi-scale dynamic memory},
author = {Antonio Carta and Alessandro Sperduti and Davide Bacciu},
year = {2020},
date = {2020-06-05},
urldate = {2020-06-05},
booktitle = {Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 (ECML-PKDD 2020)},
publisher = {Springer International Publishing},
abstract = {The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be introduced into a neural architecture by an appropriate modularization of the dynamic memory. In this paper we propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. First, we show how to extend the architecture of a simple RNN by separating its hidden state into different modules, each subsampling the network hidden activations at different frequencies. Then, we discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies. Each new module works at a slower frequency than the previous ones and it is initialized to encode the subsampled sequence of hidden activations. Experimental results on synthetic and real-world datasets on speech recognition and handwritten characters show that the modular architecture and the incremental training algorithm improve the ability of recurrent neural networks to capture long-term dependencies.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{iclr19,
title = {A Fair Comparison of Graph Neural Networks for Graph Classification},
author = {Federico Errica and Marco Podda and Davide Bacciu and Alessio Micheli},
url = {https://openreview.net/pdf?id=HygDF6NFPB, PDF
https://iclr.cc/virtual_2020/poster_HygDF6NFPB.html, Talk
https://github.com/diningphil/gnn-comparison, Code},
year = {2020},
date = {2020-04-30},
booktitle = {Proceedings of the Eighth International Conference on Learning Representations (ICLR 2020)},
abstract = {Experimental reproducibility and replicability are critical topics in machine learning. Authors have often raised concerns about their lack in scientific publications to improve the quality of the field. Recently, the graph representation learning field has attracted the attention of a wide research community, which resulted in a large stream of works.
As such, several Graph Neural Network models have been developed to effectively tackle graph classification. However, experimental procedures often lack rigorousness and are hardly reproducible. Motivated by this, we provide an overview of common practices that should be avoided to fairly compare with the state of the art. To counter this troubling trend, we ran more than 47000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks. Moreover, by comparing GNNs with structure-agnostic baselines we provide convincing evidence that, on some datasets, structural information has not been exploited yet. We believe that this work can contribute to the development of the graph learning field, by providing a much needed grounding for rigorous evaluations of graph classification models.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{esann20Tutorial,
title = {Tensor Decompositions in Deep Learning},
author = {Davide Bacciu and Danilo Mandic},
editor = {Michel Verleysen},
url = {https://arxiv.org/abs/2002.11835},
year = {2020},
date = {2020-04-21},
booktitle = {Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'20)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{esann20Castellana,
title = { Tensor Decompositions in Recursive Neural Networks for Tree-Structured Data },
author = {Daniele Castellana and Davide Bacciu},
editor = {Michel Verleysen},
url = {https://arxiv.org/pdf/2006.10619.pdf, Arxiv},
year = {2020},
date = {2020-04-21},
booktitle = {Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'20)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{ssci19,
title = {Sequential Sentence Embeddings for Semantic Similarity},
author = {Davide Bacciu and Antonio Carta},
doi = {10.1109/SSCI44817.2019.9002824},
year = {2019},
date = {2019-12-06},
urldate = {2019-12-06},
booktitle = {Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence (SSCI'19)},
publisher = {IEEE},
abstract = { Sentence embeddings are distributed representations of sentences intended to be general features to be effectively used as input for deep learning models across different natural language processing tasks.
State-of-the-art sentence embeddings for semantic similarity are computed with a weighted average of pretrained word embeddings, hence completely ignoring the contribution of word ordering within a sentence in defining its semantics. We propose a novel approach to compute sentence embeddings for semantic similarity that exploits a linear autoencoder for sequences. The method can be trained in closed form and it is easy to fit on unlabeled sentences. Our method provides a grounded approach to identify and subtract common discourse from a sentence and its embedding, to remove associated uninformative features. Unlike similar methods in the literature (e.g. the popular Smooth Inverse Frequency approach), our method is able to account for word order. We show that our estimate of the common discourse vector improves the results on two different semantic similarity benchmarks when compared to related approaches from the literature.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
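For reference, the weighted-average baseline the abstract contrasts against is the Smooth Inverse Frequency embedding of Arora et al. (2017); in its standard form (notation ours, not from this paper),

\[
v_s = \frac{1}{|s|} \sum_{w \in s} \frac{a}{a + p(w)}\, v_w,
\]

with a a smoothing hyperparameter and p(w) the unigram probability of word w, followed by removing the projection of each v_s onto the first singular vector of the embedding matrix, which estimates the common discourse direction.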
@conference{aiia2019,
title = {A non-negative factorization approach to node pooling in graph convolutional neural networks},
author = {Davide Bacciu and Luigi {Di Sotto}},
url = {https://arxiv.org/pdf/1909.03287.pdf},
year = {2019},
date = {2019-11-22},
booktitle = {Proceedings of the 18th International Conference of the Italian Association for Artificial Intelligence (AIIA 2019)},
publisher = {Springer-Verlag},
series = {Lecture Notes in Artificial Intelligence},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{lmnArx18,
title = {Linear Memory Networks},
author = {Davide Bacciu and Antonio Carta and Alessandro Sperduti},
url = {https://arxiv.org/pdf/1811.03356.pdf},
doi = {10.1007/978-3-030-30487-4_40},
year = {2019},
date = {2019-09-17},
urldate = {2019-09-17},
booktitle = {Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019)},
volume = {11727},
pages = {513-525 },
publisher = {Springer-Verlag},
series = {Lecture Notes in Computer Science},
abstract = {Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled. We introduce a novel recurrent architecture based on the conceptual separation between the functional input-output transformation and the memory mechanism, showing how they can be implemented through different neural components. By building on such conceptualization, we introduce the Linear Memory Network, a recurrent model comprising a feedforward neural network, realizing the non-linear functional transformation, and a linear autoencoder for sequences, implementing the memory component. The resulting architecture can be efficiently trained by building on closed-form solutions to linear optimization problems. Further, by exploiting equivalence results between feedforward and recurrent neural networks we devise a pretraining schema for the proposed architecture. Experiments on polyphonic music datasets show competitive results against gated recurrent networks and other state of the art models. },
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
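The separation described in the abstract, a non-linear feedforward component paired with a purely linear memory, reduces to two update equations. Below is a minimal numpy sketch of a single LMN step under that reading; weight names and sizes are illustrative, and training, which the paper performs via closed-form solutions for the linear autoencoder, is omitted.

```python
import numpy as np

def lmn_step(x_t, m_prev, Wxh, Wmh, Whm, Wmm):
    # Functional component: non-linear transformation of the current
    # input and the previous memory state.
    h_t = np.tanh(Wxh @ x_t + Wmh @ m_prev)
    # Memory component: a purely linear recurrence, the part the
    # paper implements as a linear autoencoder for sequences.
    m_t = Whm @ h_t + Wmm @ m_prev
    return h_t, m_t

# Toy usage: 3 input features, 5 hidden units, 4 memory units.
rng = np.random.default_rng(0)
Wxh, Wmh = rng.normal(size=(5, 3)), rng.normal(size=(5, 4))
Whm, Wmm = rng.normal(size=(4, 5)), rng.normal(size=(4, 4))
m = np.zeros(4)
for x in rng.normal(size=(10, 3)):   # a toy input sequence
    h, m = lmn_step(x, m, Wxh, Wmh, Whm, Wmm)
```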
@misc{automatica2019,
title = {Autonomous Grasping with SoftHands: Combining Human Inspiration, Deep Learning and Embodied Machine Intelligence},
author = {Della Santina Cosimo and Averta Giuseppe and Arapi Visar and Settimi Alessandro and Catalano Manuel Giuseppe and Bacciu Davide and Bicchi Antonio and Bianchi Matteo},
year = {2019},
date = {2019-09-11},
booktitle = {Oral contribution to AUTOMATICA.IT 2019},
keywords = {},
pubstate = {published},
tppubtype = {presentation}
}
@conference{esann19Attacks,
title = {Detecting Black-box Adversarial Examples through Nonlinear Dimensionality Reduction},
author = {Francesco Crecchi and Davide Bacciu and Battista Biggio},
editor = {Michel Verleysen},
url = {https://arxiv.org/pdf/1904.13094.pdf},
year = {2019},
date = {2019-04-24},
booktitle = {Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'19)},
publisher = {i6doc.com},
address = {Louvain-la-Neuve, Belgium},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{esann19Tutorial,
title = {Societal Issues in Machine Learning: When Learning from Data is Not Enough},
author = {Davide Bacciu and Battista Biggio and Francesco Crecchi and Paulo J. G. Lisboa and José D. Martin and Luca Oneto and Alfredo Vellido},
editor = {Michel Verleysen},
url = {https://www.elen.ucl.ac.be/Proceedings/esann/esannpdf/es2019-6.pdf},
year = {2019},
date = {2019-04-24},
booktitle = {Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'19)},
publisher = {i6doc.com},
address = {Louvain-la-Neuve, Belgium},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{tnnnls_dropin2019,
title = {Augmenting Recurrent Neural Networks Resilience by Dropout},
author = {Davide Bacciu and Francesco Crecchi},
doi = {10.1109/TNNLS.2019.2899744},
year = {2019},
date = {2019-03-31},
urldate = {2019-03-31},
journal = {IEEE Transactions on Neural Networks and Learning Systems},
abstract = {The paper discusses the simple idea that dropout regularization can be used to efficiently induce resiliency to missing inputs at prediction time in a generic neural network. We show how the approach can be effective on tasks where imputation strategies often fail, namely involving recurrent neural networks and scenarios where whole sequences of input observations are missing. The experimental analysis provides an assessment of the accuracy-resiliency tradeoff in multiple recurrent models, including reservoir computing methods, and comprising real-world ambient intelligence and biomedical time series.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
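The evaluation scenario in the abstract, whole sequences of input observations missing at prediction time, can be simulated with a small helper that blanks entire channels of a multivariate time series; a recurrent model trained with dropout on its inputs is then fed the masked sequence directly, with no imputation step. A hypothetical sketch (names and the zero fill value are assumptions):

```python
import numpy as np

def drop_channels(seq, missing, fill=0.0):
    # seq: (time, channels) multivariate time series.
    # missing: indices of channels unavailable for the whole sequence.
    out = seq.copy()
    out[:, list(missing)] = fill
    return out

seq = np.random.default_rng(0).normal(size=(100, 6))
masked = drop_channels(seq, missing=[1, 4])  # sensors 1 and 4 lost
```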
@article{ral2019,
title = {Learning from humans how to grasp: a data-driven architecture for autonomous grasping with anthropomorphic soft hands},
author = {Della Santina Cosimo and Arapi Visar and Averta Giuseppe and Damiani Francesca and Fiore Gaia and Settimi Alessandro and Catalano Manuel Giuseppe and Bacciu Davide and Bicchi Antonio and Bianchi Matteo},
url = {https://ieeexplore.ieee.org/document/8629968},
doi = {10.1109/LRA.2019.2896485},
issn = {2377-3766},
year = {2019},
date = {2019-02-01},
journal = {IEEE Robotics and Automation Letters},
pages = {1-8},
note = {Also accepted for presentation at ICRA 2019},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@conference{inns2019,
title = {Deep Tree Transductions - A Short Survey},
author = {Bacciu Davide and Bruno Antonio},
editor = {Luca Oneto and Nicol{\`o} Navarin and Alessandro Sperduti and Davide Anguita},
url = {https://arxiv.org/abs/1902.01737},
doi = {10.1007/978-3-030-16841-4_25},
year = {2019},
date = {2019-01-04},
urldate = {2019-01-04},
booktitle = {Proceedings of the 2019 INNS Big Data and Deep Learning (INNSBDDL 2019)},
pages = {236--245},
publisher = {Springer International Publishing},
series = {Recent Advances in Big Data and Deep Learning},
abstract = {The paper surveys recent extensions of Long Short-Term Memory networks to handle tree structures from the perspective of learning non-trivial forms of isomorphic structured transductions. It provides a discussion of modern TreeLSTM models, showing the effect of the bias induced by the direction of tree processing. An empirical analysis is performed on real-world benchmarks, highlighting how no single model is adequate to effectively approach all transduction problems.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{frontNeurob18,
title = {DeepDynamicHand: A deep neural architecture for labeling hand manipulation strategies in video sources exploiting temporal information},
author = {Visar Arapi and Cosimo Della Santina and Davide Bacciu and Matteo Bianchi and Antonio Bicchi},
url = {https://www.frontiersin.org/articles/10.3389/fnbot.2018.00086/full},
doi = {10.3389/fnbot.2018.00086},
year = {2018},
date = {2018-12-17},
urldate = {2018-12-17},
journal = {Frontiers in Neurorobotics},
volume = {12},
pages = {86},
abstract = {Humans are capable of complex manipulation interactions with the environment, relying on the intrinsic adaptability and compliance of their hands. Recently, soft robotic manipulation has attempted to reproduce such an extraordinary behavior, through the design of deformable yet robust end-effectors. To this goal, the investigation of human behavior has become crucial to correctly inform technological developments of robotic hands that can successfully exploit environmental constraints as humans actually do. Among the different tools robotics can leverage to achieve this objective, deep learning has emerged as a promising approach for the study, and then the implementation, of neuro-scientific observations on the artificial side. However, current approaches tend to neglect the dynamic nature of hand pose recognition problems, limiting the effectiveness of these techniques in identifying sequences of manipulation primitives underpinning action generation, e.g. during purposeful interaction with the environment. In this work, we propose a vision-based supervised Hand Pose Recognition method which, for the first time, takes into account temporal information to identify meaningful sequences of actions in grasping and manipulation tasks. More specifically, we apply Deep Neural Networks to automatically learn features from hand posture images that consist of frames extracted from grasping and manipulation task videos with objects and external environmental constraints. For training purposes, videos are divided into intervals, each associated with a specific action by a human supervisor. The proposed algorithm combines a Convolutional Neural Network to detect the hand within each video frame and a Recurrent Neural Network to predict the hand action in the current frame, while taking into consideration the history of actions performed in the previous frames. Experimental validation has been performed on two datasets of dynamic hand-centric strategies, where subjects regularly interact with objects and environment. The proposed architecture achieved very good classification accuracy on both datasets, reaching performance up to 94%, and outperforming state-of-the-art techniques. The outcomes of this study can be successfully applied to robotics, e.g. for planning and control of soft anthropomorphic manipulators.},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
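The two-stage pipeline in the abstract, a CNN detecting the hand in each frame and an RNN predicting the current action from the history of frames, can be rendered schematically as below. A plain tanh recurrence stands in for the paper's actual networks, and `extract_features`, the weights, and their shapes are all illustrative:

```python
import numpy as np

def label_frames(frames, extract_features, Wx, Wh, Wo, b):
    # Stage 1: per-frame features (the paper uses a CNN hand
    # detector); Stage 2: a recurrent layer, so each frame's action
    # scores depend on the frames that precede it.
    h = np.zeros(Wh.shape[0])
    scores = []
    for frame in frames:
        x = extract_features(frame)
        h = np.tanh(Wx @ x + Wh @ h + b)
        scores.append(Wo @ h)
    return np.array(scores)
```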
@conference{ssci2018,
title = {Text Summarization as Tree Transduction by Top-Down TreeLSTM},
author = {Bacciu Davide and Bruno Antonio},
url = {https://arxiv.org/abs/1809.09096},
doi = {10.1109/SSCI.2018.8628873},
year = {2018},
date = {2018-11-18},
urldate = {2018-11-18},
booktitle = {Proceedings of the 2018 IEEE Symposium Series on Computational Intelligence (SSCI'18)},
pages = {1411-1418},
publisher = {IEEE},
abstract = {Extractive compression is a challenging natural language processing problem. This work contributes by formulating neural extractive compression as a parse tree transduction problem, rather than a sequence transduction task. Motivated by this, we introduce a deep neural model for learning structure-to-substructure tree transductions by extending the standard Long Short-Term Memory, considering the parent-child relationships in the structural recursion. The proposed model can achieve state-of-the-art performance on sentence compression benchmarks, both in terms of accuracy and compression rate.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
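The structural recursion mentioned in the abstract runs top-down: each node's state is conditioned on its parent's, the opposite of the usual bottom-up TreeLSTM direction. A minimal sketch of that traversal, with a single tanh cell standing in for the paper's extended LSTM and a purely illustrative `Node` class:

```python
import numpy as np

class Node:
    def __init__(self, feature, children=()):
        self.feature = feature        # node feature vector
        self.children = list(children)
        self.state = None

def transduce_top_down(root, h_root, W, U, b):
    # Propagate state along parent-to-child edges, from the root
    # towards the leaves.
    stack = [(root, h_root)]
    while stack:
        node, h_parent = stack.pop()
        node.state = np.tanh(W @ node.feature + U @ h_parent + b)
        stack.extend((c, node.state) for c in node.children)
```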
@conference{icml2018,
title = {Contextual Graph Markov Model: A Deep and Generative Approach to Graph Processing},
author = {Bacciu Davide and Errica Federico and Micheli Alessio},
url = {https://arxiv.org/abs/1805.10636},
year = {2018},
date = {2018-07-11},
urldate = {2018-07-11},
booktitle = {Proceedings of the 35th International Conference on Machine Learning (ICML 2018)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{ijcnn2018,
title = {Concentric ESN: Assessing the Effect of Modularity in Cycle Reservoirs},
author = {Bacciu Davide and Bongiorno Andrea},
url = {https://arxiv.org/abs/1805.09244},
year = {2018},
date = {2018-07-09},
urldate = {2018-07-09},
booktitle = {Proceedings of the 2018 International Joint Conference on Neural Networks (IJCNN 2018) },
pages = {1-9},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{esann2018Tut,
title = {Bioinformatics and medicine in the era of deep learning},
author = {Bacciu Davide and Lisboa Paulo JG and Martin Jose D and Stoean Ruxandra and Vellido Alfredo},
editor = {Michel Verleysen},
url = {http://arxiv.org/abs/1802.09791},
isbn = {978-287587047-6},
year = {2018},
date = {2018-04-26},
urldate = {2018-04-26},
booktitle = {Proceedings of the European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN'18)},
pages = {345-354},
publisher = {i6doc.com},
address = {Louvain-la-Neuve, Belgium},
abstract = {Many of the current scientific advances in the life sciences have their origin in the intensive use of data for knowledge discovery. In no area is this as clear as in bioinformatics, led by technological breakthroughs in data acquisition technologies. It has been argued that bioinformatics could quickly become the field of research generating the largest data repositories, beating other data-intensive areas such as high-energy physics or astroinformatics. Over the last decade, deep learning has become a disruptive advance in machine learning, giving new life to the long-standing connectionist paradigm in artificial intelligence. Deep learning methods are ideally suited to large-scale data and, therefore, they should be ideally suited to knowledge discovery in bioinformatics and biomedicine at large. In this brief paper, we review key aspects of the application of deep learning in bioinformatics and medicine, drawing from the themes covered by the contributions to an ESANN 2018 special session devoted to this topic.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{dl2017,
title = {Hidden Tree Markov Networks: Deep and Wide Learning for Structured Data},
author = {Bacciu Davide},
url = {https://arxiv.org/abs/1711.07784},
year = {2017},
date = {2017-11-27},
urldate = {2017-11-27},
booktitle = {Proc. of the 2017 IEEE Symposium Series on Computational Intelligence (SSCI'17)},
publisher = {IEEE},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{ijcnn2017,
title = {DropIn: Making Neural Networks Robust to Missing Inputs by Dropout},
author = {Bacciu Davide and Crecchi Francesco and Morelli Davide},
url = {https://arxiv.org/abs/1705.02643},
doi = {10.1109/IJCNN.2017.7966106},
isbn = {978-1-5090-6182-2},
year = {2017},
date = {2017-05-19},
urldate = {2017-05-19},
booktitle = {Proceedings of the 2017 International Joint Conference on Neural Networks (IJCNN 2017) },
pages = {2080-2087},
publisher = {IEEE},
abstract = {The paper presents a novel, principled approach to train recurrent neural networks from the Reservoir Computing family that are robust to missing part of the input features at prediction time. By building on the ensembling properties of Dropout regularization, we propose a methodology, named DropIn, which efficiently trains a neural model as a committee machine of subnetworks, each capable of predicting with a subset of the original input features. We discuss the application of the DropIn methodology in the context of Reservoir Computing models, targeting applications characterized by input sources that are unreliable or prone to be disconnected, such as in pervasive wireless sensor networks and ambient intelligence. We provide an experimental assessment using real-world data from such application domains, showing that the DropIn methodology maintains predictive performance comparable to that of a model without missing features, even when 20%–50% of the inputs are not available.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
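The committee-machine view in the abstract amounts to training with a random binary mask over the input features, so that each update sees a different subset of the sensors. A toy sketch of that masking step (the function name, keep probability, and shapes are assumptions, not the paper's code):

```python
import numpy as np

rng = np.random.default_rng(0)

def dropin(x, keep_prob=0.8):
    # One random binary mask per sample over the *input* features,
    # mimicking sensors that may be disconnected at prediction time.
    mask = (rng.random(x.shape) < keep_prob).astype(x.dtype)
    return x * mask

batch = rng.normal(size=(4, 10))   # 4 samples, 10 input features
masked_batch = dropin(batch)       # fed to the training step
```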
@conference{fun2016,
title = {An Investigation into Cybernetic Humor, or: Can Machines Laugh?},
author = {Bacciu Davide and Gervasi Vincenzo and Prencipe Giuseppe},
editor = {Erik D. Demaine and Fabrizio Grandoni},
url = {http://drops.dagstuhl.de/opus/volltexte/2016/5882},
doi = {10.4230/LIPIcs.FUN.2016.3},
issn = {1868-8969},
year = {2016},
date = {2016-06-10},
booktitle = {Proceedings of the 8th International Conference on Fun with Algorithms (FUN'16) },
volume = {49},
pages = {1-15},
publisher = {Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
abstract = {The mechanisms of humour have been the subject of much study and investigation, starting in ancient times and continuing up to our days. Much of this work is based on literary theories, put forward by some of the most eminent philosophers and thinkers of all times, or medical theories, investigating the impact of humor on brain activity or behaviour. Recent functional neuroimaging studies, for instance, have investigated the process of comprehending and appreciating humor by examining functional activity in distinctive regions of brains stimulated by joke corpora. Yet, there is precious little work on the computational side, possibly due to the less hilarious nature of computer scientists as compared to men of letters and sawbones. In this paper, we set out to investigate whether literary theories of humour can stand the test of algorithmic laughter. Or, in other words, we ask ourselves the vexed question: Can machines laugh? We attempt to answer that question by testing whether an algorithm - namely, a neural network - can "understand" humour, and in particular whether it is possible to automatically identify abstractions that are predicted to be relevant by established literary theories about the mechanisms of humor. Notice that we do not focus here on distinguishing humorous from serious statements - a feat that is clearly way beyond the capabilities of the average human voter, not to mention the average machine - but rather on identifying the underlying mechanisms and triggers that are postulated to exist by literary theories, by verifying if similar mechanisms can be learned by machines.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}