Cuda Component Refactor #165

Closed

Treece-Burgess
Contributor

@Treece-Burgess Treece-Burgess commented Feb 26, 2024

Pull Request Description

This pull request adds a device qualifier to the Cuda component, which greatly reduces the number of native events listed by ./papi_native_avail. For example, on Guyot, which has eight NVIDIA A100 GPUs, the total number of native events currently output is 2,105,856, with each event formatted as:

===============================================================================
| cuda:::dram__bytes.avg:device=0                                              |     
|            # of bytes accessed in DRAM. Units=(bytes) Numpass=1              |     
--------------------------------------------------------------------------------

With the device qualifier added by this pull request, the total number of native events on Guyot drops to 263,232, with the updated output formatted as:

===============================================================================
| cuda:::dram__bytes.avg                                                       |
|            # of bytes accessed in DRAM. Units=(bytes) Numpass=1              |
|     :device=0                                                                |
|            Mandatory device qualifier [0,1,2,3,4,5,6,7]                      |
--------------------------------------------------------------------------------
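The two Guyot totals are consistent with each event being listed once instead of once per device: 2,105,856 / 8 = 263,232. A minimal sketch of that arithmetic follows. The function name `collapsed_event_count` is hypothetical and not part of PAPI, and the sketch assumes every device exposes the same set of events, which seems reasonable for eight identical GPUs.

```c
/* Hypothetical helper (not part of PAPI): number of listed events once the
 * per-device duplicates are collapsed into a single entry that carries a
 * device qualifier.
 * Assumption: every device exposes the same set of events.
 * Returns 0 when the inputs do not divide evenly (assumption violated). */
unsigned long collapsed_event_count(unsigned long per_device_total,
                                    unsigned int num_devices)
{
    if (num_devices == 0 || per_device_total % num_devices != 0)
        return 0;
    return per_device_total / num_devices;
}
```

Called with the Guyot figures, `collapsed_event_count(2105856UL, 8)` yields 263232, matching the total quoted above.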

Example outputs from other systems are shown below.

Hopper1 has a single NVIDIA GH200 GPU and output for that system appears as follows:

===============================================================================
| cuda:::FE_A.TriageCompute.gr__cycles_active.avg                              |
|            # of cycles where GR was active. Units=(sys_clks) Numpass=1       |
|     :device=0                                                                |
|            Mandatory device qualifier [0]                                    |
--------------------------------------------------------------------------------

Leconte has eight NVIDIA Tesla V100 GPUs and output for that system appears as follows:

===============================================================================
| cuda:::dram__bytes.avg                                                       |
|            # of bytes accessed in DRAM. Units=(bytes) Numpass=1              |
|     :device=0                                                                |
|            Mandatory device qualifier [0,1,2,3,4,5,6,7]                      |
--------------------------------------------------------------------------------
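Since the device qualifier is listed as mandatory, a caller presumably has to append `:device=N` to the base event name before adding the event. The sketch below builds such a name in the `cuda:::<event>:device=<N>` form shown in the outputs above. The helper `cuda_event_with_device` is hypothetical and not part of PAPI; it only formats the string, and it is an assumption that the result can then be passed to a call such as `PAPI_add_named_event`.

```c
#include <stdio.h>

/* Hypothetical helper (not part of PAPI): append the mandatory device
 * qualifier to a base Cuda event name, e.g.
 *   "cuda:::dram__bytes.avg" + device 0 -> "cuda:::dram__bytes.avg:device=0"
 * Returns 0 on success, -1 on invalid arguments or if the buffer is too small. */
int cuda_event_with_device(char *out, size_t out_len,
                           const char *base_event, int device)
{
    if (out == NULL || base_event == NULL || out_len == 0 || device < 0)
        return -1;

    int n = snprintf(out, out_len, "%s:device=%d", base_event, device);

    /* snprintf returns the length it wanted to write; a value >= out_len
     * means the name was truncated. */
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}
```

On a system like Leconte, the valid values for `device` would be the indices listed in the qualifier description, here 0 through 7.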

Author Checklist

  • Description
    Why this PR exists. Reference all relevant information, including background, issues, test failures, etc
  • Commits
    Commits are self contained and only do one thing
    Commits have a header of the form: module: short description
    Commits have a body (whenever relevant) containing a detailed description of the addressed problem and its solution
  • Tests
    The PR needs to pass all the tests

@Treece-Burgess Treece-Burgess self-assigned this Apr 19, 2024
@Treece-Burgess Treece-Burgess added the type-feature label Apr 19, 2024
@Treece-Burgess Treece-Burgess marked this pull request as ready for review April 22, 2024 14:40