Add parallel_for API with keyword struct #188
Test this please
30e3902 to 4c166a9
Test this please
4c166a9 to f732585
Test this please
f732585 to acef0e9
Test this please
acef0e9 to ef6b464
Test this please
@kwdef mutable struct LaunchSpec{Backend}
    stream = default_stream(Backend)
    threads = 0
Just curious why some fields are typed explicitly and some are not. FYI, threads = 0 will default to Int64.
The type of the stream is determined by the backend. Threads can be either an Integer or a tuple of 2 or 3 integers. The same goes for blocks.
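A minimal sketch of that field layout, for illustration only. `DemoBackend`, `DemoStream`, `DemoLaunchSpec`, and `LaunchDims` are hypothetical names, not JACC's definitions, and the explicit `LaunchDims` annotation is an assumption about how "an Integer or a tuple of 2 or 3 integers" could be written down:

```julia
# Hypothetical stand-ins for illustration; these are not JACC's definitions.
struct DemoBackend end
struct DemoStream end

# Each backend chooses its own stream type.
default_stream(::Type{DemoBackend}) = DemoStream()

# Thread and block counts: one integer, or a tuple of 2 or 3 integers.
const LaunchDims = Union{Integer, NTuple{2, Integer}, NTuple{3, Integer}}

Base.@kwdef mutable struct DemoLaunchSpec{Backend}
    stream = default_stream(Backend)  # untyped: the backend decides the type
    threads::LaunchDims = 0           # the literal 0 is an Int (Int64 on 64-bit systems)
    blocks::LaunchDims = 0
end

spec = DemoLaunchSpec{DemoBackend}(; threads = (4, 4, 4))
```

Leaving `stream` untyped keeps the struct independent of any one backend's stream type, at the cost of that field being stored as `Any`.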
@@ -93,10 +128,10 @@ end
 end

 @testset "reduce" begin
-    a = JACC.array([1 for i=1:10])
+    a = JACC.array([1 for i in 1:10])
Maybe for later, do a formatting-only PR. It seems these are not functional changes.
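For reference, a quick check in plain Julia (no JACC needed) that the two comprehension spellings in this hunk build the same array, which is why the change reads as style only. The variable names are just for illustration:

```julia
# In a comprehension, `=` and `in` are interchangeable iteration syntax.
old_style = [1 for i=1:10]
new_style = [1 for i in 1:10]

same = old_style == new_style
```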
Good point. I formatted as I went on this one, so the formatter also touched code that predates this PR.
A = JACC.ones(Float32, N, N, N)
B = JACC.ones(Float32, N, N, N)
C = JACC.zeros(Float32, N, N, N)
JACC.parallel_for(JACC.launch_spec(; threads = (4, 4, 4)),
I guess this is required to be the first argument since we are using Varargs. Another overloaded version, for later, could be similar to CUDA.jl (since AMDGPU.jl now follows it):
JACC.@launch_spec threads = (16, 16) blocks = (1,1) sync=false JACC.parallel_for(N, f, x...)
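To illustrate the Varargs point, here is a rough sketch of why the spec goes first. All names (`DemoSpec`, `demo_parallel_for`, `demo_axpy!`) are hypothetical and the serial loop only stands in for a real backend launch; this is not JACC's implementation. Because the trailing `x...` collects every remaining argument, a spec placed at the end could not be told apart from a kernel argument, whereas a typed first argument lets dispatch select the right method:

```julia
# Hypothetical stand-in for the launch spec; not JACC's type.
Base.@kwdef struct DemoSpec
    threads = 0
    blocks = 0
    sync = true
end

# Spec-first method: dispatch on the first argument selects it,
# and everything after `f` is forwarded to the kernel unchanged.
function demo_parallel_for(spec::DemoSpec, N::Integer, f, x...)
    for i in 1:N              # serial loop standing in for a backend launch
        f(i, x...)
    end
    return spec
end

# A call without a spec falls back to the default settings.
demo_parallel_for(N::Integer, f, x...) = demo_parallel_for(DemoSpec(), N, f, x...)

# Toy kernel: xs[i] += alpha * ys[i]
demo_axpy!(i, alpha, xs, ys) = (xs[i] += alpha * ys[i]; nothing)

xs = zeros(4)
ys = ones(4)
used = demo_parallel_for(DemoSpec(; threads = (16, 16), sync = false), 4, demo_axpy!, 2.0, xs, ys)
```

A macro form like the one proposed above would presumably build the spec from the keyword-style options and then make this same spec-first call.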
I'll open an issue for this feature.
LGTM, thanks @PhilipFackler. Remove the WIP when ready.
@williamfgc ready. And we can just close the other PR.
This addresses #179 and #180