
Questions on Table 3 of the MASS paper? #149

Open
Epsilon-Lee opened this issue Jun 11, 2020 · 2 comments

Comments

@Epsilon-Lee
Hi, dear authors:

I wonder how you conducted the DAE pre-training experiments shown in Table 3. (I only list the en-fr results below.)

Table 3. The BLEU score comparisons between MASS and other pre-training methods.

| Method  | en-fr | fr-en |
| ------- | ----- | ----- |
| BERT+LM | 33.4  | 32.3  |
| DAE     | 30.1  | 28.3  |
| MASS    | 37.5  | 34.9  |

As you described in your paper,

"The second baseline is DAE, which simply uses denoising auto-encoder to pretrain the encoder and decoder."

So my questions are:

  1. Do you simply use pairs (c(x), x) from the two languages to pre-train the seq2seq model? That is, the same DAE as in the DAE-loss + back-translation fine-tuning of XLM? (A sketch of what I mean by c is given right after this list.)
  2. Are there any tricks needed to make it work?
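
For concreteness, the corruption c I have in mind is the word-dropout plus local-shuffle noise of Lample et al. (2018). This is a minimal sketch; the function name and parameters are illustrative, not taken from the MASS or XLM code:

```python
import random

def c(tokens, p_drop=0.1, k=3):
    # Illustrative noise function c(x): word dropout + local shuffle,
    # in the spirit of Lample et al. (2018). Names and parameters are
    # mine, not from the MASS or XLM code.
    kept = [t for t in tokens if random.random() > p_drop]
    if not kept:                       # keep at least one token
        kept = tokens[:1]
    # local shuffle: jitter each index by up to k and re-sort, so no
    # token moves more than ~k positions from where it started
    keys = [i + random.uniform(0, k) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept))]

# e.g. c("the cat sat on the mat".split()) -> a slightly corrupted copy,
# such as ['the', 'sat', 'cat', 'on', 'mat']
```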

I have done experiments using DAE to pre-train the seq2seq model, but when I continue training with (only) BT [1], I get the following BLEU scores, which are far lower than your reported ones, so I wonder if I have misunderstood some details.

Table. My run

| Method | en-fr | fr-en |
| ------ | ----- | ----- |
| DAE    | 11.20 | 10.68 |

Please help me out here, thanks a lot.

Footnote.
[1] One difference between my training setting and yours: during fine-tuning, I only use the BT loss instead of both DAE and BT.
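
To spell out the footnote: by "both DAE and BT" I mean an XLM-style fine-tuning objective that sums the two terms below, whereas my run keeps only the second. The notation is mine, a sketch rather than the paper's exact formulation:

```latex
% DAE term: reconstruct x from its corruption c(x).
% BT term: \hat{y}(x) is the model's own translation of x into the other
% language, generated on the fly without gradient flow.
\mathcal{L}(\theta)
  = \underbrace{\mathbb{E}_{x \sim \mathcal{D}}\!\left[-\log P_\theta\!\left(x \mid c(x)\right)\right]}_{\text{DAE}}
  + \underbrace{\mathbb{E}_{x \sim \mathcal{D}}\!\left[-\log P_\theta\!\left(x \mid \hat{y}(x)\right)\right]}_{\text{BT}}
```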

@StillKeepTry (Contributor)

@Epsilon-Lee We conducted this ablation study using this code. Using DAE with a fully shared model leads to an identity mapping; you can try the older unsupervised NMT framework, which supports leaving some modules unshared.
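
For intuition, "some modules unshared" could look like the sketch below: shared embeddings and encoder, but one decoder per language, so the denoising objective cannot be satisfied by a single copy-through mapping reused for both translation directions. This is a hypothetical PyTorch sketch, not code from either framework; the class and argument names are mine.

```python
import torch.nn as nn

class PartiallySharedSeq2Seq(nn.Module):
    """Hypothetical sketch (not the authors' code): shared embeddings and
    encoder, but a separate, unshared decoder per language, so that DAE
    pre-training is less prone to collapsing into one identity mapping
    shared by both translation directions."""

    def __init__(self, vocab_size, d_model=512, nhead=8, num_layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)               # shared
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)  # shared

        def make_decoder():  # one unshared decoder per language
            dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
            return nn.TransformerDecoder(dec_layer, num_layers)

        self.decoders = nn.ModuleDict({"en": make_decoder(), "fr": make_decoder()})
        self.proj = nn.Linear(d_model, vocab_size)                   # shared

    def forward(self, src, tgt, tgt_lang):
        memory = self.encoder(self.embed(src))
        out = self.decoders[tgt_lang](self.embed(tgt), memory)
        return self.proj(out)
```

(If I remember correctly, the older UnsupervisedMT codebase exposes options controlling how many encoder/decoder layers are shared, which achieves the same effect at the layer level.)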

@Epsilon-Lee (Author)

Thanks a lot for your response! I will definitely give it a try and then come back to report the numbers.
