About pearson score #3

joewellhe · 2017-12-23T07:29:27Z

I read your paper "Better Summarization Evaluation with Word Embeddings for ROUGE" and I'm very interested in your work.
I ran ROUGE on the same data you used, but my Pearson correlations are not as good as yours.
For example, the Pearson correlation of ROUGE-2 with Pyr is 0.59 (computed with the MATLAB script provided by TAC),
whereas your paper reports 0.96.
How did you get such a high score? If you preprocessed the TAC data, could you tell me how?
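
For reference, the comparison described above can be sketched in a few lines of Python. This is only a minimal illustration of computing a Pearson correlation between two lists of scores; it is not the TAC MATLAB script and not the paper's evaluation code. The function name `pearson` and the score lists are hypothetical placeholders, not real TAC data.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder scores, NOT real TAC data:
# one ROUGE-2 score and one Pyramid score per evaluated item.
rouge2 = [0.05, 0.07, 0.09, 0.11]
pyramid = [0.20, 0.30, 0.28, 0.45]

r = pearson(rouge2, pyramid)
print(f"Pearson r = {r:.3f}")
```

The result depends on what each list entry represents (for example one score per summary or one averaged score per system), so both lists need to be built at the same granularity before comparing against a published number.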

Lukecn1 · 2020-07-16T10:53:02Z

I have the exact same issue: I am not able to reproduce the high correlation scores between ROUGE and the human evaluations reported in the paper.

I get scores very similar to those reported by OP.

Did you do any preprocessing and if so, is it possible to see this?
