
CC1 and CC2 Benchmarking #2

Open
AtreusCorp opened this issue Feb 14, 2021 · 5 comments
Labels
question (Further information is requested)

Comments

@AtreusCorp

AtreusCorp commented Feb 14, 2021

Hi there. I've read the associated paper and am very interested in the methodology. In particular, for a course project, I would like to see if I can improve on your results (even in a small way), but I am having trouble seeing how to run the given code on a test set. My questions are as follows:

  1. Does the s2 dataset (for which processing instructions are listed in the README) correspond to the CC2 set from the paper?
  2. Does your code have a quick way of reproducing your test metrics on the imputed data? I would be very happy to implement something like this.
  3. Is there a straightforward path to partitioning the given dataset for train / test purposes? Or perhaps managing 2 datasets, one for train and one for test?

Any clarity here would be very much appreciated. Thanks for the neat paper!

@stefaniaebli
Owner

Hi, thank you for your interest and kind words. We would be very interested in seeing our results improved. I will answer your questions below.

  1. The s2 dataset is very large. The data we work with, CC1 and CC2, are two smaller datasets subsampled from s2 (the appendix of the paper describes how they were subsampled). In particular, CC1 and CC2 are coauthorship complexes, and you can use the script s2_4_bipartite_to_downsampled.py to obtain further coauthorship complexes with the same procedure.

  2. I can add the code we used for reproducing our test metric.

  3. The straightforward way to do it is, as you said, to maintain two datasets. For instance, we used CC1 as the training set and CC2 as the test set. With the script mentioned in point 1, you can subsample many train and test datasets from s2 for this purpose; a rough sketch of that workflow follows below.
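To make the idea concrete, here is a minimal sketch of the two-dataset workflow. It is not our actual code: the file names, the loader and the data layout below are placeholders, and the imputed values stand in for whatever the trained model predicts.

```python
import numpy as np

# Placeholder loader: assumes each downsampled complex is stored as an .npz
# file with the true citation values and a boolean mask of the entries that
# were artificially hidden. The actual repository format may differ.
def load_complex(path):
    data = np.load(path)
    return data["citations"], data["missing_mask"]

train_values, train_mask = load_complex("cc1.npz")  # fit the model on this
test_values, test_mask = load_complex("cc2.npz")    # held out for testing

# `imputed` stands for the trained model's predictions on the test complex;
# here it is only a placeholder array of the right shape.
imputed = np.zeros_like(test_values)

# Compare imputed and true values only where the ground truth was hidden.
mae = np.abs(imputed[test_mask] - test_values[test_mask]).mean()
print(f"mean absolute error on hidden test entries: {mae:.3f}")
```

The real pipeline goes through the preprocessing scripts in this repository; the sketch only shows where CC1 and CC2 would enter as train and test sets.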

Let me know if you have any further questions, I will be more than happy to discuss them.

@mdeff added the question (Further information is requested) label on Mar 25, 2021
@cxw-droid

Hi,

Thanks for the interesting paper. Would you mind posting your testing code so that the paper's results can be reproduced? I looked at impute_citations.py, and it seems it only trains a model but does not test accuracy, etc.

Thanks.
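In the meantime, something along these lines is what I had in mind; the within-tolerance accuracy below is only my guess at a reasonable metric, not necessarily the one reported in the paper.

```python
import numpy as np

def imputation_accuracy(y_true, y_pred, tol=0.05):
    # Fraction of imputed citation counts that land within a relative
    # tolerance of the true counts (the tolerance is an assumption).
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rel_err = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), 1.0)
    return float((rel_err <= tol).mean())

# Example: three of the four predictions are within 5% of the truth.
print(imputation_accuracy([10, 100, 50, 7], [10, 103, 49, 12]))  # 0.75
```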

@AtreusCorp
Author

@cxw-droid You might find my fork useful. I have implemented some of this, albeit with poor documentation.
https://github.com/AtreusCorp/simplicial_neural_networks

@mdeff
Collaborator

mdeff commented Dec 10, 2021

> I can add the code we used for reproducing our test metric.

Could we do that @stefaniaebli? It doesn't matter if it's ugly. Anything is better than nothing. :)

@stefaniaebli
Owner

Hi, sorry for the late answer! @mdeff, sure!
