This repository has been archived by the owner on Jan 27, 2024. It is now read-only.

Two step training #10

Closed
Verythai opened this issue Nov 10, 2020 · 1 comment

Comments

@Verythai

Referring to the paper at http://www.statmt.org/wmt19/pdf/54/WMT12.pdf: "We separated the training process into two steps: the first phase for training a generic model, and the second phase to finetune the model. For the first phase, we trained the model with a union dataset that is the concatenation of eSCAPE-NMT-filtered, and the upsampled official training set by copying 20 times. After reaching the convergence point in the first phase, we fine-tuned the model by running the second phase using only the official training set."

and referring to this repository's README:

```
train_src: train_data.tok.srcmt
train_tgt: train_data.tok.pe

valid_src: dev.tok.srcmt
valid_tgt: dev.tok.pe

save_data: prep-data

...
```

How can I save the second-step training data to "prep-data" during preprocessing? Does it get overwritten, or is the new data simply appended to "prep-data"?

Thank you.
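For what it's worth, one way to keep the two phases separate — a sketch only, assuming an OpenNMT-py-style setup where `save_data` is just an output path prefix, and where `prep-data-finetune` is a hypothetical name — is to preprocess each phase with its own `save_data` prefix, so the fine-tuning data never touches the first-phase files:

```yaml
# Phase 2 preprocessing config (sketch; file names are assumptions,
# not from the repo). Using a distinct save_data prefix avoids
# overwriting the phase-1 "prep-data" files.
train_src: official_train.tok.srcmt
train_tgt: official_train.tok.pe

valid_src: dev.tok.srcmt
valid_tgt: dev.tok.pe

save_data: prep-data-finetune
```

Training for the second phase would then point at the `prep-data-finetune` prefix while loading the checkpoint produced in phase 1.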

@goncalomcorreia
Collaborator

This is not the repository of the paper you referenced, so I'm not familiar with what the authors of that paper did.
