Add save_checkpoint arg for TIMM training to simplify validation #1701

Open: ZhengHongming888 wants to merge 6 commits into base: main


ZhengHongming888 (Contributor) commented Jan 17, 2025

What does this PR do?

To simplify validation testing, this PR adds an argument that controls whether save_checkpoint is executed, which shortens validation time. By default, the command below does not save a checkpoint after each epoch.

python train_hpu_graph.py \
    --data-dir ./ \
    --dataset hfds/johnowhitaker/imagenette2-320 \
    --device 'hpu' \
    --model resnet50.a1_in1k \
    --train-split train \
    --val-split train \
    --dataset-download

To save a checkpoint after each epoch, add the --save_checkpoint argument to the command.
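The behavior described above (no checkpoint by default, one per epoch when --save_checkpoint is passed) can be sketched roughly as follows. This is a minimal illustration, not the code in train_hpu_graph.py: only the flag name --save_checkpoint comes from this PR, while build_parser, train, fake_save, and the --epochs argument are hypothetical names used for the example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with an opt-in checkpoint flag (sketch only)."""
    parser = argparse.ArgumentParser(description="save_checkpoint flag sketch")
    # store_true means the flag is False unless it appears on the command line,
    # so checkpoint saving is skipped by default.
    parser.add_argument(
        "--save_checkpoint",
        action="store_true",
        default=False,
        help="save a checkpoint after each epoch",
    )
    # Hypothetical argument, present only to keep the example self-contained.
    parser.add_argument("--epochs", type=int, default=2)
    return parser


def train(args: argparse.Namespace, save_fn) -> list:
    """Run a placeholder training loop and return whatever save_fn produced."""
    saved = []
    for epoch in range(args.epochs):
        # ... train and evaluate one epoch here ...
        if args.save_checkpoint:
            # Only reached when --save_checkpoint was passed.
            saved.append(save_fn(epoch))
    return saved


def fake_save(epoch: int) -> str:
    """Stand-in for a real checkpoint writer; returns the name it would write."""
    return f"checkpoint-{epoch}.pth.tar"
```

With this shape, parsing an empty argument list yields a run that saves nothing, and adding --save_checkpoint produces one saved entry per epoch.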

In addition, README.md is simplified to focus on validation.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

jiminha (Collaborator) left a comment

LGTM!

jiminha added the run-test label (Run CI for PRs from external contributors) on Jan 17, 2025