Add nsys command argument to profile cuda graph workload (#138)
* Add nsys command argument to profile cuda graph workload; fix nsys profile export path

Signed-off-by: Guyue Huang <[email protected]>

* Revert a change to nsys profile path because the bug is fixed in ToT (top of tree)

Signed-off-by: Guyue Huang <[email protected]>

* Fix test_slurm nsys command

Signed-off-by: Guyue Huang <[email protected]>

---------

Signed-off-by: Guyue Huang <[email protected]>
Signed-off-by: Guyue Huang <[email protected]>
Co-authored-by: Guyue Huang <[email protected]>
guyueh1 and Guyue Huang authored Jan 23, 2025
1 parent 11e0d2f commit 5ed6128
Showing 2 changed files with 2 additions and 0 deletions.
1 change: 1 addition & 0 deletions src/nemo_run/core/execution/base.py
@@ -50,6 +50,7 @@ def get_nsys_prefix(self, profile_dir: str) -> Optional[list[str]]:
"true",
"--capture-range=cudaProfilerApi",
"--capture-range-end=stop",
+ "--cuda-graph-trace=node",
]
return args
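
The hunk above appends one flag to the nsys argument list. As a rough illustration of its effect: with `--cuda-graph-trace=node`, Nsight Systems records the individual nodes (kernels, memcpys) inside a CUDA graph, whereas the default `graph` mode traces each graph launch as a single unit. The sketch below is not the nemo_run implementation. The helper name `add_cuda_graph_trace` and the shortened base argument list are hypothetical, and only the three flags visible in the diff are taken from the source.

```python
# Flags copied from the diff; everything else in this sketch is illustrative.
CAPTURE_FLAGS = [
    "--capture-range=cudaProfilerApi",
    "--capture-range-end=stop",
]


def add_cuda_graph_trace(args: list[str], mode: str = "node") -> list[str]:
    """Return a copy of args with --cuda-graph-trace=<mode> set exactly once.

    Hypothetical helper: nemo_run hard-codes the flag in its argument list.
    """
    if mode not in ("graph", "node"):
        raise ValueError(f"unsupported cuda graph trace mode: {mode!r}")
    # Drop any earlier setting so the flag is not passed twice.
    kept = [a for a in args if not a.startswith("--cuda-graph-trace")]
    return kept + [f"--cuda-graph-trace={mode}"]


# Assumed, shortened base command; the real prefix carries more options.
nsys_args = add_cuda_graph_trace(["profile", *CAPTURE_FLAGS])
print(" ".join(nsys_args))
```

Node-level tracing can add noticeable runtime overhead for graph-heavy workloads, so combining it with the `cudaProfilerApi` capture range, as this commit does, keeps the traced window small.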

1 change: 1 addition & 0 deletions test/core/execution/test_slurm.py
@@ -508,6 +508,7 @@ def test_dummy_batch_request_nsys(
"true",
"--capture-range=cudaProfilerApi",
"--capture-range-end=stop",
+ "--cuda-graph-trace=node",
]

def test_dummy_batch_request_warn(