DL4Graph Assessment

A Fair Comparison of Graph Neural Networks for Graph Classification

The official Python implementation for our paper benchmarking Graph Neural Networks on classification tasks.

The code is maintained on the GitHub page of my student Federico Errica, who is to be credited for the implementation together with Marco Podda. To download the code and the scripts needed to replicate the experiments in the original paper, please go here.

The code is provided as is, with no warranty or technical support. Please inform the authors of the original paper (details below) if you intend to redistribute the code.


If you find this code useful, please remember to cite:

Federico Errica, Marco Podda, Davide Bacciu, Alessio Micheli: A Fair Comparison of Graph Neural Networks for Graph Classification. Proceedings of the Eighth International Conference on Learning Representations (ICLR 2020), 2020.

BibTeX (Download)

@conference{ErricaICLR2020,
  title = {A Fair Comparison of Graph Neural Networks for Graph Classification},
  author = {Federico Errica and Marco Podda and Davide Bacciu and Alessio Micheli},
  url = {https://openreview.net/pdf?id=HygDF6NFPB, PDF
         https://iclr.cc/virtual_2020/poster_HygDF6NFPB.html, Talk
         https://github.com/diningphil/gnn-comparison, Code},
  year = {2020},
  date = {2020-04-30},
  booktitle = {Proceedings of the Eighth International Conference on Learning Representations (ICLR 2020)},
  abstract = {Experimental reproducibility and replicability are critical topics in machine learning. Authors have often raised concerns about their lack in scientific publications to improve the quality of the field. Recently, the graph representation learning field has attracted the attention of a wide research community, which resulted in a large stream of works. As such, several Graph Neural Network models have been developed to effectively tackle graph classification. However, experimental procedures often lack rigorousness and are hardly reproducible. Motivated by this, we provide an overview of common practices that should be avoided to fairly compare with the state of the art. To counter this troubling trend, we ran more than 47000 experiments in a controlled and uniform framework to re-evaluate five popular models across nine common benchmarks. Moreover, by comparing GNNs with structure-agnostic baselines we provide convincing evidence that, on some datasets, structural information has not been exploited yet. We believe that this work can contribute to the development of the graph learning field, by providing a much needed grounding for rigorous evaluations of graph classification models.},
  keywords = {deep learning, deep learning for graphs, graph data, structured data processing},
  pubstate = {published},
  tppubtype = {conference}
}