Here you can find a consolidated (a.k.a. slowly updated) list of my publications. A frequently updated (and possibly noisy) list of works is available on my Google Scholar profile.
Below is a short list of highlighted publications from my recent activity.
Valenti, Andrea; Bacciu, Davide: Modular Representations for Weak Disentanglement. In: Proceedings of the 30th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2022), 2022.
Valenti, Andrea; Bacciu, Davide: Leveraging Relational Information for Learning Weakly Disentangled Representations. In: Proceedings of the 2022 IEEE World Congress on Computational Intelligence, IEEE, 2022.
Carta, Antonio; Sperduti, Alessandro; Bacciu, Davide: Encoding-based Memory for Recurrent Neural Networks. In: Neurocomputing, vol. 456, pp. 407-420, 2021.
Valenti, Andrea; Berti, Stefano; Bacciu, Davide: Calliope - A Polyphonic Music Transformer. In: Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021), 2021.
Valenti, Andrea; Barsotti, Michele; Bacciu, Davide; Ascari, Luca: A Deep Classifier for Upper-Limbs Motor Anticipation Tasks in an Online BCI Setting. In: Bioengineering, vol. 8, no. 2, 2021.
Valenti, Andrea; Barsotti, Michele; Brondi, Raffaello; Bacciu, Davide; Ascari, Luca: ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs. In: Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), IEEE, 2020.
Valenti, Andrea; Carta, Antonio; Bacciu, Davide: Learning a Latent Space of Style-Aware Music Representations by Adversarial Autoencoders. In: Proceedings of the 24th European Conference on Artificial Intelligence (ECAI 2020), 2020.
Carta, Antonio; Sperduti, Alessandro; Bacciu, Davide: Incremental training of a recurrent neural network exploiting a multi-scale dynamic memory. In: Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 (ECML-PKDD 2020), Springer International Publishing, 2020.
Bacciu, Davide; Carta, Antonio: Sequential Sentence Embeddings for Semantic Similarity. In: Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence (SSCI'19), IEEE, 2019.
Bacciu, Davide; Carta, Antonio; Sperduti, Alessandro: Linear Memory Networks. In: Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019), Lecture Notes in Computer Science, vol. 11727, Springer-Verlag, 2019.

The corresponding BibTeX entries follow.

@conference{Valenti2022c,
title = {Modular Representations for Weak Disentanglement},
author = {Andrea Valenti and Davide Bacciu},
editor = {Michel Verleysen},
url = {https://arxiv.org/pdf/2209.05336.pdf},
year = {2022},
date = {2022-10-05},
urldate = {2022-10-05},
booktitle = {Proceedings of the 30th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2022)},
abstract = {Recently introduced weakly disentangled representations relax some constraints of previous definitions of disentanglement in exchange for more flexibility. However, at the moment, weak disentanglement can only be achieved by increasing the amount of supervision as the number of factors of variation of the data increases. In this paper, we introduce modular representations for weak disentanglement, a novel method that keeps the amount of supervised information constant with respect to the number of generative factors. The experiments show that models using modular representations can improve their performance with respect to previous work without the need for additional supervision.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@conference{Valenti2022,
title = {Leveraging Relational Information for Learning Weakly Disentangled Representations},
author = {Andrea Valenti and Davide Bacciu},
url = {https://arxiv.org/abs/2205.10056, Arxiv},
year = {2022},
date = {2022-07-18},
urldate = {2022-07-18},
booktitle = {Proceedings of the 2022 IEEE World Congress on Computational Intelligence},
publisher = {IEEE},
abstract = {Disentanglement is a difficult property to enforce in neural representations. This might be due, in part, to a formalization of the disentanglement problem that focuses too heavily on separating relevant factors of variation of the data in single isolated dimensions of the neural representation. We argue that such a definition might be too restrictive and not necessarily beneficial in terms of downstream tasks. In this work, we present an alternative view on learning (weakly) disentangled representations, which leverages concepts from relational learning. We identify the regions of the latent space that correspond to specific instances of generative factors, and we learn the relationships among these regions in order to perform controlled changes to the latent codes. We also introduce a compound generative model that implements such a weak disentanglement approach. Our experiments show that the learned representations can separate the relevant factors of variation in the data, while preserving the information needed for effectively generating high-quality data samples.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{Carta2021b,
title = {Encoding-based Memory for Recurrent Neural Networks},
author = {Antonio Carta and Alessandro Sperduti and Davide Bacciu},
url = {https://arxiv.org/abs/2001.11771, Arxiv},
doi = {10.1016/j.neucom.2021.04.051},
year = {2021},
date = {2021-10-07},
urldate = {2021-10-07},
journal = {Neurocomputing},
volume = {456},
pages = {407-420},
publisher = {Elsevier},
abstract = {Learning to solve sequential tasks with recurrent models requires the ability to memorize long sequences and to extract task-relevant features from them. In this paper, we study the memorization subtask from the point of view of the design and training of recurrent neural networks. We propose a new model, the Linear Memory Network, which features an encoding-based memorization component built with a linear autoencoder for sequences. We extend the memorization component with a modular memory that encodes the hidden state sequence at different sampling frequencies. Additionally, we provide a specialized training algorithm that initializes the memory to efficiently encode the hidden activations of the network. The experimental results on synthetic and real-world datasets show that specializing the training algorithm to train the memorization component always improves the final performance whenever the memorization of long sequences is necessary to solve the problem. },
keywords = {},
pubstate = {published},
tppubtype = {article}
}
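The encoding-based memory above boils down to a linear recurrence over the input history that can be approximately inverted to recover that history. Below is a toy numpy sketch of the idea; the matrix names A and B and the pseudo-inverse decoding are illustrative choices rather than the paper's notation, and random matrices only give a rough reconstruction (the paper derives the encoders in closed form).

import numpy as np

# Toy linear autoencoder for sequences: the memory state is a linear
# encoding of the input history, m_t = A x_t + B m_{t-1}.
rng = np.random.default_rng(0)
d_in, d_mem, T = 4, 32, 5
A = rng.standard_normal((d_mem, d_in)) * 0.1
B = rng.standard_normal((d_mem, d_mem)) * 0.1

xs = rng.standard_normal((T, d_in))
m = np.zeros(d_mem)
for x in xs:                                 # encode the whole sequence
    m = A @ x + B @ m

# Decoding: since m_t = [A B] @ [x_t; m_{t-1}], the pseudo-inverse of
# [A B] approximately recovers (x_t, m_{t-1}) and lets us unroll the
# memory back in time, one step per iteration.
Z = np.linalg.pinv(np.hstack([A, B]))
recovered = []
for _ in range(T):
    xm = Z @ m
    recovered.append(xm[:d_in])              # decoded x_t
    m = xm[d_in:]                            # decoded previous memory
recovered = np.array(recovered[::-1])        # reorder oldest-first
print(float(np.abs(recovered - xs).mean()))  # rough reconstruction error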
@conference{Valenti2021b,
title = {Calliope - A Polyphonic Music Transformer},
author = {Andrea Valenti and Stefano Berti and Davide Bacciu},
editor = {Michel Verleysen},
abstract = {The polyphonic nature of music makes the application of deep learning to music modelling a challenging task. On the other hand, the Transformer architecture seems to be a good fit for this kind of data. In this work, we present Calliope, a novel autoencoder model based on Transformers for the efficient modelling of multi-track sequences of polyphonic music. The experiments show that our model is able to improve the state of the art on musical sequence reconstruction and generation, with remarkably good results especially on long sequences.},
doi = {10.14428/esann/2021.ES2021-63},
year = {2021},
date = {2021-10-06},
urldate = {2021-10-06},
booktitle = {Proceedings of the 29th European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2021)},
pages = {405-410},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
@article{Valenti2021,
title = {A Deep Classifier for Upper-Limbs Motor Anticipation Tasks in an Online BCI Setting},
author = {Andrea Valenti and Michele Barsotti and Davide Bacciu and Luca Ascari},
url = {https://www.mdpi.com/2306-5354/8/2/21, Open Access},
doi = {10.3390/bioengineering8020021},
year = {2021},
date = {2021-02-05},
urldate = {2021-02-05},
journal = {Bioengineering},
volume = {8},
number = {2},
keywords = {},
pubstate = {published},
tppubtype = {article}
}
@conference{smc2020,
title = {ROS-Neuro Integration of Deep Convolutional Autoencoders for EEG Signal Compression in Real-time BCIs},
author = {Andrea Valenti and Michele Barsotti and Raffaello Brondi and Davide Bacciu and Luca Ascari},
url = {https://arxiv.org/abs/2008.13485, Arxiv},
year = {2020},
date = {2020-10-11},
urldate = {2020-10-11},
booktitle = {Proceedings of the 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC)},
publisher = {IEEE},
abstract = {Typical EEG-based BCI applications require the computation of complex functions over the noisy EEG channels to be carried out in an efficient way. Deep learning algorithms are capable of learning flexible nonlinear functions directly from data, and their constant processing latency is perfect for their deployment into online BCI systems. However, it is crucial for the jitter of the processing system to be as low as possible, in order to avoid unpredictable behaviour that can ruin the system's overall usability. In this paper, we present a novel encoding method, based on deep convolutional autoencoders, that is able to perform efficient compression of the raw EEG inputs. We deploy our model in a ROS-Neuro node, thus making it suitable for integration into ROS-based BCI and robotic systems in real-world scenarios. The experimental results show that our system is capable of generating meaningful compressed encodings that preserve the original information contained in the raw input. They also show that the ROS-Neuro node is able to produce such encodings at a steady rate, with minimal jitter. We believe that our system can represent an important step towards the development of an effective BCI processing pipeline fully standardized in the ROS-Neuro framework.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
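As a rough illustration of the kind of compression model described in the abstract, here is a minimal 1D convolutional autoencoder for multi-channel EEG windows in PyTorch. The channel counts, kernel sizes and window length are illustrative assumptions, not the architecture from the paper.

import torch
import torch.nn as nn

class EEGConvAutoencoder(nn.Module):
    """Compress (batch, channels, samples) EEG windows to a small code."""
    def __init__(self, n_channels=8):
        super().__init__()
        # Strided 1D convolutions halve the temporal resolution twice.
        self.encoder = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=5, stride=2, padding=2),
            nn.ReLU(),
            nn.Conv1d(16, 4, kernel_size=5, stride=2, padding=2),
        )
        # Transposed convolutions mirror the encoder for reconstruction.
        self.decoder = nn.Sequential(
            nn.ConvTranspose1d(4, 16, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose1d(16, n_channels, kernel_size=5, stride=2,
                               padding=2, output_padding=1),
        )

    def forward(self, x):
        z = self.encoder(x)                  # compressed encoding
        return self.decoder(z), z

model = EEGConvAutoencoder()
x = torch.randn(2, 8, 256)                   # 2 windows, 8 ch, 256 samples
x_hat, z = model(x)
loss = nn.functional.mse_loss(x_hat, x)      # train by reconstruction

A feedforward stack of strided convolutions has a fixed, input-independent processing time, which is the constant-latency property the abstract highlights for online deployment.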
@conference{ecai2020,
title = {Learning a Latent Space of Style-Aware Music Representations by Adversarial Autoencoders},
author = {Andrea Valenti and Antonio Carta and Davide Bacciu},
url = {https://arxiv.org/abs/2001.05494},
year = {2020},
date = {2020-06-08},
booktitle = {Proceedings of the 24th European Conference on Artificial Intelligence (ECAI 2020)},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
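An adversarial autoencoder, the model family this work builds on, couples a reconstruction objective with an adversarial game that pushes the distribution of latent codes toward a chosen prior. Below is a minimal single-iteration sketch; the layer sizes, the Gaussian prior and the MSE reconstruction loss are assumptions for illustration, not the paper's exact setup.

import torch
import torch.nn as nn

d_in, d_z = 128, 16
enc = nn.Sequential(nn.Linear(d_in, 64), nn.ReLU(), nn.Linear(64, d_z))
dec = nn.Sequential(nn.Linear(d_z, 64), nn.ReLU(), nn.Linear(64, d_in))
disc = nn.Sequential(nn.Linear(d_z, 64), nn.ReLU(), nn.Linear(64, 1))

opt_ae = torch.optim.Adam([*enc.parameters(), *dec.parameters()], lr=1e-3)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

x = torch.randn(32, d_in)                    # stand-in for a data batch

# 1) Reconstruction phase: the autoencoder learns to rebuild its input.
recon = nn.functional.mse_loss(dec(enc(x)), x)
opt_ae.zero_grad(); recon.backward(); opt_ae.step()

# 2) Regularisation phase: the discriminator learns to tell samples
#    from the prior (label 1) apart from encoder outputs (label 0).
z = enc(x).detach()
prior = torch.randn_like(z)                  # assumed Gaussian prior
d_loss = (bce(disc(prior), torch.ones(32, 1))
          + bce(disc(z), torch.zeros(32, 1)))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# 3) The encoder tries to fool the discriminator, which shapes the
#    aggregated posterior over latent codes toward the prior.
g_loss = bce(disc(enc(x)), torch.ones(32, 1))
opt_ae.zero_grad(); g_loss.backward(); opt_ae.step()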
@conference{ecml2020LMN,
title = {Incremental training of a recurrent neural network exploiting a multi-scale dynamic memory},
author = {Antonio Carta and Alessandro Sperduti and Davide Bacciu},
year = {2020},
date = {2020-06-05},
urldate = {2020-06-05},
booktitle = {Proceedings of the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2020 (ECML-PKDD 2020)},
publisher = {Springer International Publishing},
abstract = {The effectiveness of recurrent neural networks can be largely influenced by their ability to store into their dynamical memory information extracted from input sequences at different frequencies and timescales. Such a feature can be introduced into a neural architecture by an appropriate modularization of the dynamic memory. In this paper we propose a novel incrementally trained recurrent architecture targeting explicitly multi-scale learning. First, we show how to extend the architecture of a simple RNN by separating its hidden state into different modules, each subsampling the network hidden activations at different frequencies. Then, we discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies. Each new module works at a slower frequency than the previous ones and it is initialized to encode the subsampled sequence of hidden activations. Experimental results on synthetic and real-world datasets on speech recognition and handwritten characters show that the modular architecture and the incremental training algorithm improve the ability of recurrent neural networks to capture long-term dependencies.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
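A toy sketch of the multi-scale idea from the abstract, assuming module i updates every 2**i steps, so later modules see progressively slower summaries of the input. The update frequencies, the tanh cell and the stacked wiring are illustrative guesses, not the paper's exact architecture.

import numpy as np

rng = np.random.default_rng(1)
d = 8
n_modules = 3                         # module i updates every 2**i steps
W_in = [rng.standard_normal((d, d)) * 0.1 for _ in range(n_modules)]
W_rec = [rng.standard_normal((d, d)) * 0.1 for _ in range(n_modules)]
states = [np.zeros(d) for _ in range(n_modules)]

xs = rng.standard_normal((16, d))
for t, x in enumerate(xs):
    feed = x
    for i in range(n_modules):
        if t % (2 ** i) == 0:         # slower modules subsample in time
            states[i] = np.tanh(W_in[i] @ feed + W_rec[i] @ states[i])
        feed = states[i]              # next module reads the previous one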
@conference{ssci19,
title = {Sequential Sentence Embeddings for Semantic Similarity},
author = {Davide Bacciu and Antonio Carta},
doi = {10.1109/SSCI44817.2019.9002824},
year = {2019},
date = {2019-12-06},
urldate = {2019-12-06},
booktitle = {Proceedings of the 2019 IEEE Symposium Series on Computational Intelligence (SSCI'19)},
publisher = {IEEE},
abstract = { Sentence embeddings are distributed representations of sentences intended to be general features to be effectively used as input for deep learning models across different natural language processing tasks.
State-of-the-art sentence embeddings for semantic similarity are computed with a weighted average of pretrained word embeddings, hence completely ignoring the contribution of word ordering within a sentence in defining its semantics. We propose a novel approach to compute sentence embeddings for semantic similarity that exploits a linear autoencoder for sequences. The method can be trained in closed form and it is easy to fit on unlabeled sentences. Our method provides a grounded approach to identify and subtract common discourse from a sentence and its embedding, to remove associated uninformative features. Unlike similar methods in the literature (e.g. the popular Smooth Inverse Frequency approach), our method is able to account for word order. We show that our estimate of the common discourse vector improves the results on two different semantic similarity benchmarks when compared to related approaches from the literature.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
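For context, the Smooth Inverse Frequency baseline named in the abstract computes a frequency-weighted average of word vectors and then removes the common discourse direction, estimated as the first principal component of the sentence embedding matrix. A minimal numpy sketch follows; the weighting constant a and the input format are the usual ones from the SIF literature.

import numpy as np

def sif_embeddings(sentences, word_vecs, word_freq, a=1e-3):
    """Weighted-average sentence embeddings with the common discourse
    (first principal component) removed, as in Smooth Inverse Frequency.

    sentences: list of token lists; word_vecs / word_freq: dicts
    mapping tokens to vectors and relative frequencies.
    """
    embs = []
    for sent in sentences:
        weights = [a / (a + word_freq[w]) for w in sent]
        embs.append(np.average([word_vecs[w] for w in sent],
                               axis=0, weights=weights))
    E = np.vstack(embs)
    # First right singular vector estimates the common discourse direction.
    u = np.linalg.svd(E, full_matrices=False)[2][0]
    return E - np.outer(E @ u, u)     # subtract the projection onto it

The paper's contribution replaces this PCA-based estimate of the common discourse with one computed through a linear autoencoder for sequences, which, unlike the averaging above, accounts for word order.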
@conference{lmnArx18,
title = {Linear Memory Networks},
author = {Davide Bacciu and Antonio Carta and Alessandro Sperduti},
url = {https://arxiv.org/pdf/1811.03356.pdf},
doi = {10.1007/978-3-030-30487-4_40},
year = {2019},
date = {2019-09-17},
urldate = {2019-09-17},
booktitle = {Proceedings of the 28th International Conference on Artificial Neural Networks (ICANN 2019)},
volume = {11727},
pages = {513-525},
publisher = {Springer-Verlag},
series = {Lecture Notes in Computer Science},
abstract = {Recurrent neural networks can learn complex transduction problems that require maintaining and actively exploiting a memory of their inputs. Such models traditionally consider memory and input-output functionalities indissolubly entangled. We introduce a novel recurrent architecture based on the conceptual separation between the functional input-output transformation and the memory mechanism, showing how they can be implemented through different neural components. By building on such a conceptualization, we introduce the Linear Memory Network, a recurrent model comprising a feedforward neural network, realizing the non-linear functional transformation, and a linear autoencoder for sequences, implementing the memory component. The resulting architecture can be efficiently trained by building on closed-form solutions to linear optimization problems. Further, by exploiting equivalence results between feedforward and recurrent neural networks, we devise a pretraining schema for the proposed architecture. Experiments on polyphonic music datasets show competitive results against gated recurrent networks and other state-of-the-art models.},
keywords = {},
pubstate = {published},
tppubtype = {conference}
}
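The separation described in this last abstract can be written as two coupled updates: a non-linear feedforward transformation of the input, and a purely linear memory update. The sketch below follows that description; the matrix names and sizes are chosen here for illustration and are not the paper's notation.

import numpy as np

rng = np.random.default_rng(2)
d_x, d_h, d_m = 4, 16, 16
W_xh = rng.standard_normal((d_h, d_x)) * 0.1   # functional component
W_mh = rng.standard_normal((d_h, d_m)) * 0.1
W_hm = rng.standard_normal((d_m, d_h)) * 0.1   # linear memory component
W_mm = rng.standard_normal((d_m, d_m)) * 0.1

m = np.zeros(d_m)
for x in rng.standard_normal((10, d_x)):
    h = np.tanh(W_xh @ x + W_mh @ m)   # non-linear input-output transform
    m = W_hm @ h + W_mm @ m            # memory update, no non-linearity

Keeping the memory update linear is what makes the closed-form training and the autoencoder-based initialization mentioned in the abstract possible.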