Skip to content

Latest commit

 

History

History
161 lines (126 loc) · 7.94 KB

README.md

File metadata and controls

161 lines (126 loc) · 7.94 KB

Block-Approximated Exponential Random Graphs

This repository contains the source code, installation and use instructions for the methods presented in the paper: Block-Approximated Exponential Random Graphs. Instructions for replicating the experiments in the paper are also given.

We provide two different implementations, a Python version for the MaxEnt model with full structural constraints and a Matlab version for the MaxEnt using low rank approximations.

Installation

The Python version of MaxEnt can be installed directly from GitHub with:

$ pip install git+https://github.com/aida-ugent/MaxEntComb.git

It can be also installed from source with:

$ git clone https://github.com/aida-ugent/MaxEntComb.git
$ cd MaxEntComb
$ pip install -e .

Where -e means "editable" mode.

The MaxEnt Matlab version using low rank approximations requires Matlab R2018b or higher. The code also uses the open source minFunc package for unconstrained differentiable multivariate optimization by M. Schmidt. For convenience, this packages is already included in the /matlab folder.

Note: The Python MaxEntComb code was extensively tested under Python 2.7.x. We therefore recommend this setting in order to obtain the best possible results. The code also works for other version of Python >=3.6 but it has not been thoroughly tested.

Usage

Running MaxEnt Python:

A maxentcomb CLI command is added on pip install which runs the model. The command takes the following parameters:

  -h, --help                Show this help message and exit.
  --inputgraph [INPUTGRAPH] Input graph path.
  --output [OUTPUT]         Path where the embeddings will be stored. Default is `network.emb`.
  --tr_e [TR_E]             Path of the input train edges. Default is None (in this case returns embeddings)
  --tr_pred [TR_PRED]       Path where the train predictions will be stored. Default is `tr_pred.csv`
  --te_e [TE_E]             Path of the input test edges. Default is None.
  --te_pred [TE_PRED]       Path where the test predictions will be stored. Default is `te_pred.csv`
  --learning_rate LEARNING_RATE
                            Learning rate. Default is 0.1.
  --epochs EPOCHS           Training epochs. Default is 100.
  --grad_min_norm GRAD_MIN_NORM
                            Early stop if grad norm is below this value. Default is 0.0001.
  --optimizer OPTIMIZER     Optimizer to use. Options are `newton` and `grad_desc`. Default is `newton`.
  --memory MEMORY           The constraints on the memory to use. Options are `quadratic` and `linear`. 
                            Quadratic is considerably faster but used O(n^2) memory. Default is `quadratic`.
  --prior PRIOR [PRIOR ...] The prior to use. Options are `CN`, `AA`, `A3`, `JC`, `PA` or list combining them. 
                            Default is [`CN`].
  --delimiter DELIMITER     The delimiter used in the input edgelist. Output will use the same one. Default is `,`
  --dimension DIMENSION     Not used. Default is 1.
  --verbose                 Determines the verbosity level of the output. Default is False.

NOTE: The user can define new constraint matrices F. However, these need to implement the interfaces provided in the weighted_lin_constr.py file.

Examples of running the Python MaxEnt model are:

# Example 1: Outputs posterior matrix after calculation
maxentcomb --inputgraph ./graph.edgelist --prior 'CN' 'PA' --optimizer 'newton'
# Example 2: Outputs predictions for the edge pairs in the tr_e and te_e files
maxentcomb --inputgraph ./graph.edgelist --tr_e ./tr.edgelist --te_e ./te.edgelist --tr_pred './tr.out' --te_pred './te.out'

Running MaxEnt Matlab:

The file test.m provides an overview on how to run a complete pipeline using the Matlab MaxEnt code. This code generates a symmetric adjacency matrix and computes its spectral decomposition. Second and third order proximity matrices are then computed, as well as low rank approximations for Resource Allocation Index and Preferential Attachment. A cartesian product of these block approximation is then obtained and the MaxEnt model is fitted on this binning using the lbfgs solver from the minFunc package. Finally train and test predictions are computed for the input edge pairs.

In order to simplify the execution of the Matlab MaxEnt version, we also provide a pre-compiled file named test as well as a shell script run_test.sh for running it. The shell script takes the following parameters in this order: inputgraph, tr_e, te_e, tr_pred, te_pred, dim, bins, weights. The descriptions of the first 5 parameters are the same as for the Python version. As for the remaining ones:

dim:        FLOAT       Embedding dimensionality
bins:       FLOAT       Number of bins each F matrix is divided in.
weights:    LIST        The multiplicity for each high-order priximity of A. 
                        E.g. weights[1]*A^2 + weights[2]*A^3 ...

An example of running the Matlab MaxEnt version:

./run_test.sh /usr/local/MATLAB/MATLAB_Runtime/v95 graph.edgelist ./tr.edgelist ./te.edgelist './tr.out' './te.out' 8 100 [1,0.1]

Reproducing Experiments

In order to reproduce the experiments presented in the paper the following steps are necessary:

  1. Download and install the EvalNE library as instructed here

  2. Download and install the MaxEnt method as shown in the Installation section above

  3. Download and install the baseline methods. We recommend that each method is installed in a unique virtual environment to ensure that the right dependencies are used. For all methods except SDNE, we use the implementations provided by the original authors:

    For SDNE the implementation available here is the only one which has shown similar results to those in the original paper.

  4. Download the datasets:

  5. Download the .ini configuration files from the /experiments folder. Modify the dataset paths and method paths to match the respective installation directories and run each files as:

    python -m evalne ./experiments/conf_maxent.ini

NOTE: For AROPE a main.py file is required in order to run the evaluation through EvalNE. This file is also available in the /experiments folder.

NOTE2: To ensure that the same results reported in the paper are obtained, we recommend that both MaxEnt and the EvalNE library are installed in independent virtual environments under Python 2.7.x.

Citation

If you have found our research useful, please consider citing our paper, which is also available on arxiv:

@INPROCEEDINGS{9260034,
  author={F. {Adriaens} and A. {Mara} and J. {Lijffijt} and T. {De Bie}},
  booktitle={2020 IEEE 7th International Conference on Data Science and Advanced Analytics (DSAA)}, 
  title={Block-Approximated Exponential Random Graphs}, 
  year={2020},
  pages={70-80},
  doi={10.1109/DSAA49011.2020.00019}}