Hi!

I was wondering if it is possible to keep all the checkpoints of a run, not just the best one. This would be useful since, to my knowledge, it is currently not possible to run multiple evaluation jobs during training. That way, I could run training with the evaluation job I want to use for early stopping, and later evaluate other metrics over the course of the whole training run from the saved checkpoints.

Let me know if there is a flag or option that has this effect.
Thanks again!
Correct, we currently don't support multiple evaluations during training (see #102).
But as you said, you can keep all the checkpoints with the following option:
```yaml
checkpoint:
  # In addition to the checkpoint of the last epoch (which is transient),
  # create an additional checkpoint every this many epochs. Disable
  # additional checkpoints with 0.
  every: 5
  # Keep this many of the most recent additional checkpoints.
  keep: 3
```
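For example, to retain a checkpoint from every epoch, you could create an additional checkpoint each epoch and set the retention count to at least the expected number of epochs. This is a sketch, assuming `keep` accepts arbitrary positive values and the run lasts at most 100 epochs (the 100 is a placeholder, not a documented bound):

```yaml
checkpoint:
  # Create an additional checkpoint after every epoch.
  every: 1
  # Retain the 100 most recent additional checkpoints, i.e. all of them
  # for a run of at most 100 epochs (100 is an assumed placeholder;
  # adjust to your expected number of epochs).
  keep: 100
```

The saved checkpoints can then be evaluated offline for any other metrics once training has finished.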