fix profiling execute_multipass #2239

Open
wants to merge 1 commit into
base: main
from

Conversation

rjodinchr
Contributor

  • fix clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES) by using the proper size

  • clamp localThreads[2] in the same way as localThreads[0] and localThreads[1]

  • clamp all localThreads elements with regard to CL_MAX_WORK_GROUP_SIZE

  • fix the size used to create/read the output buffer

Fix #2238
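
For readers following the list above, here is a minimal sketch of the clamping idea in C. It is not the code from this PR: `clamp_local_threads` and its halving strategy are hypothetical, and the limits are passed in as plain arguments rather than queried. In a real program they would come from `clGetDeviceInfo`; `CL_DEVICE_MAX_WORK_ITEM_SIZES` returns one `size_t` per work-item dimension, so the query size has to cover the whole array (for three dimensions, `3 * sizeof(size_t)`).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from the PR.
 * Step 1: clamp each local size to its per-dimension device limit.
 * Step 2: halve the largest dimension until the total work-group size
 *         (the product of all three) fits within max_wg_size. */
static void clamp_local_threads(size_t local[3],
                                const size_t max_item_sizes[3],
                                size_t max_wg_size)
{
    /* A conforming device reports a work-group limit of at least 1. */
    assert(max_wg_size >= 1);

    for (int i = 0; i < 3; i++) {
        if (local[i] > max_item_sizes[i])
            local[i] = max_item_sizes[i];
        if (local[i] == 0)
            local[i] = 1; /* assumption: never launch with a zero-sized dimension */
    }

    while (local[0] * local[1] * local[2] > max_wg_size) {
        int big = 0;
        for (int i = 1; i < 3; i++) {
            if (local[i] > local[big])
                big = i;
        }
        /* The product exceeds 1 here, so the largest entry is at least 2. */
        local[big] /= 2;
    }
}
```

With `local = {64, 64, 64}`, per-dimension limits `{1024, 1024, 64}`, and a work-group limit of 256, this sketch yields `{4, 8, 8}`.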

@rjodinchr rjodinchr force-pushed the pr/execute-multipass branch 2 times, most recently from b9801df to b6193a8 Compare January 22, 2025 13:52
- fix clGetDeviceInfo(CL_DEVICE_MAX_WORK_ITEM_SIZES) by using the
proper size

- clamp all localThreads elements with regard to CL_MAX_WORK_GROUP_SIZE

- fix the size used to create/read the output buffer

Fix KhronosGroup#2238
Contributor

@bashbaug bashbaug left a comment


Works for me.

I was initially confused by the allocation, because it's sized from nChannels even though the kernel unconditionally writes four channels, but I guess it's OK because we only ever call this function when nChannels is four.
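
To make the size concern concrete, here is a rough sketch in C. The helper names, the one-element-per-channel layout, and the use of `float` in the usage note are assumptions for illustration only; the PR does not show the actual buffer layout.

```c
#include <stddef.h>

/* Hypothetical: bytes the host allocates when sizing the buffer from
 * n_channels, assuming one element of elem_size per channel per pixel. */
static size_t alloc_bytes(size_t width, size_t height,
                          size_t n_channels, size_t elem_size)
{
    return width * height * n_channels * elem_size;
}

/* Hypothetical: bytes touched by a kernel that always stores
 * four channels per pixel, whatever n_channels is. */
static size_t written_bytes(size_t width, size_t height, size_t elem_size)
{
    return width * height * 4 * elem_size;
}

/* Returns 1 when the allocation covers everything the kernel writes. */
static int allocation_is_sufficient(size_t width, size_t height,
                                    size_t n_channels, size_t elem_size)
{
    return alloc_bytes(width, height, n_channels, elem_size)
        >= written_bytes(width, height, elem_size);
}
```

Under these assumptions, a 16x16 `float` image is covered when `n_channels` is 4 and would fall short when it is 3, which is why the allocation is only safe as long as every caller passes four channels.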

Development

Successfully merging this pull request may close these issues.

profiling execute_multipass issues
3 participants