Replies: 1 comment
-
Hi, no, this feature is not currently implemented in the main branch. PyTorch Lightning does support it, though, so you would have to tweak NeMo's exp_manager to handle S3 paths properly, and then PyTorch Lightning will do its magic.
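To make the "tweak" a bit more concrete, here is a minimal sketch of the kind of path handling involved. It assumes the problem is that directory settings get passed through `pathlib`, which collapses the `//` in `s3://bucket/...`. The helpers `is_remote_path` and `join_dir` and the bucket name are hypothetical and not part of NeMo or PyTorch Lightning.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit


def is_remote_path(path: str) -> bool:
    """Return True when the path carries a URL scheme such as s3://."""
    return "://" in path and urlsplit(path).scheme != ""


def join_dir(base: str, *parts: str) -> str:
    """Join path components without mangling a remote URL prefix."""
    if is_remote_path(base):
        # Plain string join keeps the "s3://" prefix intact.
        return "/".join([base.rstrip("/"), *[p.strip("/") for p in parts]])
    # Local paths can go through pathlib as usual.
    return str(PurePosixPath(base, *parts))


# pathlib collapses the double slash, which breaks the URL:
mangled = str(PurePosixPath("s3://my-bucket/exp"))

# The hypothetical helper keeps it intact:
ckpt_dir = join_dir("s3://my-bucket/exp", "checkpoints")
```

A string built this way could then be handed to the checkpoint callback as its directory, leaving the actual S3 writes to PyTorch Lightning. Whether that is enough depends on how the rest of exp_manager touches the directory (for example, local `mkdir` calls), so treat it as a starting point only.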
-
Hi,

Our project is planning to use `exp_manager` (https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/core/exp_manager.html) to manage the configs for our experiment job tracker, but we want to save checkpoints to S3 and resume from them.

After going through the code for `exp_manager`, can I check with the NeMo team whether saving checkpoints to S3 is supported? If it is, is this set at `explicit_log_dir` or `exp_dir`? Or would it be set at `checkpoint_callback_params` instead, for MLFlowLogger?

Thanks!