This work is based on source code of the paper "A Generalization of Transformer Networks to Graphs" by Vijay Prakash Dwivedi and Xavier Bresson, at AAAI'21 Workshop on Deep Learning on Graphs: Methods and Applications (DLG-AAAI'21).
Our adaptations are the following:
- New scripts to conduct additional experiments.
- New arguments available with `main_SBMs_node_classification.py`. Some of them allow saving the model so it can be tested on another dataset; others add a renormalization of the Laplacian encoding.
- Creation of `test_SBM.py` to test a saved model on a specific dataset.
- Two Jupyter notebooks in the folder `data/SBMs/generate_datasets`.

The results of our adaptations are presented in the `presentation` folder.
Two new scripts are available:

bash scripts/SBMs/script_main_SBMs_node_classification_PATTERN_500k_LaplRenorm.sh
bash scripts/SBMs/script_main_SBMs_node_classification_PATTERN_500k_SizeEm.sh

- The first one provides empirical results on the effect of a homothety (uniform scaling) applied to the Laplacian encoding.
- The second one provides empirical results on the effect of changing the dimension of the Laplacian encoding, i.e. taking more or fewer eigenvectors of the Laplacian for the embedding (a minimal sketch of this encoding follows below).
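For context, here is a minimal sketch of how a Laplacian positional encoding of dimension `pos_enc_dim` can be computed and rescaled by a homothety factor. The function and argument names (`laplacian_pos_enc`, `renorm_factor`) are illustrative, not the repository's; the actual encoding is built in the repository's data-loading code, and the scaling factor corresponds conceptually to `--renormalization_pos_enc`.

```python
import numpy as np
import scipy.sparse as sp

def laplacian_pos_enc(adj, pos_enc_dim, renorm_factor=1.0):
    """Sketch: Laplacian positional encoding for one graph, with rescaling.

    adj           : (N, N) adjacency matrix (dense array or scipy sparse)
    pos_enc_dim   : number of non-trivial eigenvectors kept as the encoding
    renorm_factor : homothety (uniform scaling) applied to the encoding
    """
    adj = sp.csr_matrix(adj, dtype=float)
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = np.zeros_like(deg)
    d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
    D_inv_sqrt = sp.diags(d_inv_sqrt)

    # Symmetric normalized Laplacian: L = I - D^{-1/2} A D^{-1/2}
    lap = sp.eye(adj.shape[0]) - D_inv_sqrt @ adj @ D_inv_sqrt

    # Dense eigendecomposition is fine for SBM-sized graphs (~100 nodes);
    # eigenvalues come back in ascending order.
    _, eigvec = np.linalg.eigh(lap.toarray())

    # Drop the trivial eigenvector (eigenvalue ~0), keep the next pos_enc_dim
    pe = eigvec[:, 1:pos_enc_dim + 1]
    return renorm_factor * pe
```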
- New argument for the homothety: `--renormalization_pos_enc` has been added to the argument parser of `main_SBMs_node_classification.py`. It is the scaling factor of the homothety applied to the Laplacian encoding, and it overrides the value in the config file if provided.
- New arguments to save or load a model: these make it possible to save a model (either to train it again later or to test it on another dataset) and to load it to continue the training phase (see the checkpointing sketch after the usage lines). Usage:

python main_SBMs_node_classification.py --save_model out/Models/modelName
python main_SBMs_node_classification.py --load_model out/Models/modelName
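These flags presumably wrap the standard PyTorch checkpointing pattern; a minimal sketch under that assumption (the exact path handling and checkpoint contents in `main_SBMs_node_classification.py` may differ):

```python
import torch

def save_checkpoint(model, path):
    """What --save_model is assumed to do: write the model weights to disk."""
    torch.save(model.state_dict(), path)

def load_checkpoint(model, path, device="cpu"):
    """What --load_model is assumed to do: restore weights into a model
    built with the same net_params, then resume training or evaluate."""
    model.load_state_dict(torch.load(path, map_location=device))
    return model
```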
Two pretrained models are provided in `out/Models` (80 epochs, about 3h of execution time each). Their configurations are the following:
For `baseSMB-lapEnc`:
Dataset: SBM_PATTERN,
Model: GraphTransformer
params={'seed': 41, 'epochs': 80, 'batch_size': 20, 'init_lr': 0.0005, 'lr_reduce_factor': 0.5, 'lr_schedule_patience': 10, 'min_lr': 1e-06, 'weight_decay': 0.0, 'print_epoch_interval': 5, 'max_time': 24}
net_params={'L': 10, 'n_heads': 8, 'hidden_dim': 80, 'out_dim': 80, 'residual': True, 'readout': 'mean', 'in_feat_dropout': 0.0, 'dropout': 0.0, 'layer_norm': False, 'batch_norm': True, 'self_loop': False, 'lap_pos_enc': True, 'pos_enc_dim': 2, 'wl_pos_enc': False, 'full_graph': False, 'device': device(type='cuda'), 'gpu_id': 0, 'batch_size': 20, 'renormalization_pos_enc': 1.0, 'in_dim': 3, 'n_classes': 2, 'total_param': 522982}
Total Parameters: 522982
For `baseBSM-NoLap`:
Dataset: SBM_PATTERN,
Model: GraphTransformer
params={'seed': 41, 'epochs': 80, 'batch_size': 20, 'init_lr': 0.0005, 'lr_reduce_factor': 0.5, 'lr_schedule_patience': 10, 'min_lr': 1e-06, 'weight_decay': 0.0, 'print_epoch_interval': 5, 'max_time': 24}
net_params={'L': 10, 'n_heads': 8, 'hidden_dim': 80, 'out_dim': 80, 'residual': True, 'readout': 'mean', 'in_feat_dropout': 0.0, 'dropout': 0.0, 'layer_norm': False, 'batch_norm': True, 'self_loop': False, 'lap_pos_enc': False, 'pos_enc_dim': 2, 'wl_pos_enc': False, 'full_graph': False, 'device': device(type='cuda'), 'gpu_id': 0, 'batch_size': 20, 'renormalization_pos_enc': 1.0, 'in_dim': 3, 'n_classes': 2, 'total_param': 522742}
Total Parameters: 522742
A test file `test_SBM.py` has been created to test a model (saved as described previously) on another dataset. Usage:
python test_SBM.py --load_model out/Models/modelName --config config/configName --dataset data/SBMs/datasetName
This is a long operation (>5h).

The two Jupyter notebooks are in `data/SBMs/generate_datasets`.
First execute `generate_SBM_PATTERN.ipynb` and then `prepare_SBM_PATTERN.ipynb` (a conceptual sketch of the stochastic block model generation is given below).
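Conceptually, each SBM graph is drawn from a stochastic block model with an intra-block edge probability p and an inter-block probability q, with q assumed here to be the parameter varied across the four downloadable datasets. A minimal sketch with networkx that only illustrates the p/q roles; it is not the notebooks' actual generation code:

```python
import networkx as nx

def sample_sbm_graph(block_sizes, p_intra, q_inter, seed=0):
    """Sample one stochastic-block-model graph.

    block_sizes : community sizes, e.g. [60, 60]
    p_intra     : edge probability inside a block
    q_inter     : edge probability between blocks (the varied parameter q)
    """
    k = len(block_sizes)
    # k x k edge-probability matrix: p_intra on the diagonal, q_inter elsewhere
    probs = [[p_intra if i == j else q_inter for j in range(k)] for i in range(k)]
    return nx.stochastic_block_model(block_sizes, probs, seed=seed)

# Example: two communities of 60 nodes, denser inside blocks than between them
g = sample_sbm_graph([60, 60], p_intra=0.5, q_inter=0.2)
print(g.number_of_nodes(), g.number_of_edges())
```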
Four datasets can be downloaded with the following links. The parameter changed is q:
- https://filesender.renater.fr/?s=download&token=9022a4ba-fe2b-48e8-bdd3-7b23e18ac9e2
- https://filesender.renater.fr/?s=download&token=e6b8268f-b10f-4db5-97f8-e4a77589c2bd
- https://filesender.renater.fr/?s=download&token=6b0954ec-1096-4cb9-b300-6a7a81ab0e1d
- https://filesender.renater.fr/?s=download&token=6e16d46e-723e-4bd9-b862-a454447a34ab
Source code for the paper "A Generalization of Transformer Networks to Graphs" by Vijay Prakash Dwivedi and Xavier Bresson, at AAAI'21 Workshop on Deep Learning on Graphs: Methods and Applications (DLG-AAAI'21).
We propose a generalization of the transformer neural network architecture for arbitrary graphs: Graph Transformer.
Compared to the Standard Transformer, the highlights of the presented architecture are:
- The attention mechanism is a function of neighborhood connectivity for each node in the graph (see the sketch after this list).
- The position encoding is represented by Laplacian eigenvectors, which naturally generalize the sinusoidal positional encodings often used in NLP.
- The layer normalization is replaced by a batch normalization layer.
- The architecture is extended to have edge representation, which can be critical to tasks with rich information on the edges, or pairwise interactions (such as bond types in molecules or relationship types in KGs, etc.).
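To make the first point concrete, here is a minimal single-head sketch of attention restricted to graph neighbors: scores are computed only along edges and softmax-normalized over each node's incoming neighbors. This is a conceptual illustration, not the repository's DGL-based implementation; all names here are illustrative.

```python
import torch

def neighborhood_attention(h, edge_index, Wq, Wk, Wv):
    """Single-head attention where node i only attends to its neighbors j.

    h          : (N, d) node features
    edge_index : (2, E) tensor of (src, dst) pairs, i.e. the graph's edges
    Wq, Wk, Wv : (d, d) projection matrices
    """
    src, dst = edge_index
    q, k, v = h @ Wq, h @ Wk, h @ Wv
    d = q.size(1)

    # Unnormalized score for each edge (dst attends to src)
    scores = (q[dst] * k[src]).sum(dim=-1) / d ** 0.5

    # Softmax over each destination node's incoming edges
    scores = scores - scores.max()                 # numerical stability
    exp = scores.exp()
    denom = torch.zeros(h.size(0)).index_add_(0, dst, exp)
    alpha = exp / denom[dst]

    # Weighted sum of neighbor values
    out = torch.zeros_like(h).index_add_(0, dst, alpha.unsqueeze(-1) * v[src])
    return out

# Tiny usage example: 3 nodes, edges 0->1, 1->2, 2->0
h = torch.randn(3, 4)
edges = torch.tensor([[0, 1, 2], [1, 2, 0]])
W = [torch.randn(4, 4) for _ in range(3)]
print(neighborhood_attention(h, edges, *W).shape)  # torch.Size([3, 4])
```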
Figure: Block Diagram of Graph Transformer Architecture
This project is based on the benchmarking-gnns repository.
Follow these instructions to install the benchmark and setup the environment.
Proceed as follows to download the datasets used to evaluate Graph Transformer.
Use this page to run the codes and reproduce the published results.
📃 Paper on arXiv
📝 Blog on Towards Data Science
🎥 Video on YouTube
@article{dwivedi2021generalization,
title={A Generalization of Transformer Networks to Graphs},
author={Dwivedi, Vijay Prakash and Bresson, Xavier},
journal={AAAI Workshop on Deep Learning on Graphs: Methods and Applications},
year={2021}
}