Broadcasting and reshaped, transposed, CuArrays #228
Comments
Dup of JuliaGPU/Adapt.jl#21. With how Adapt.jl currently works, it's much too complicated/expensive to extend our support for array wrappers to doubly-wrapped arrays (here, ReshapedArray of Transpose of CuArray).
OK, sorry I didn't search there. Is there a workaround? Like some mechanism to tell it to keep unwrapping, even if this can't be triggered automatically? I think what's happening in my second example is that one singly-wrapped CuArray pushes the whole broadcast to be done by CUDA, and then it's happy... is there or should there be a way to make this happen?
zed = similar(mat,1) .= 0
zed .+ CUDA.exp.(reshape(transpose(mat), 2,1,2)) # ok!
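The effect described here can be reproduced without a GPU. The sketch below is only an illustration: `ToyArray` and `ToyStyle` are hypothetical stand-ins, not CUDA.jl types, and `Broadcast.combine_styles` is an internal Base function used here just for inspection. It suggests why a doubly-wrapped array falls back to the default broadcast style, while one bare array of the custom type pulls the whole broadcast onto the custom style.

```julia
using LinearAlgebra

# Hypothetical toy array type standing in for a GPU array (NOT CUDA.jl's CuArray).
struct ToyArray{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
Base.size(a::ToyArray) = size(a.data)
Base.getindex(a::ToyArray{T,N}, i::Vararg{Int,N}) where {T,N} = a.data[i...]

# Hypothetical custom broadcast style for ToyArray.
struct ToyStyle{N} <: Broadcast.AbstractArrayStyle{N} end
ToyStyle{M}(::Val{N}) where {M,N} = ToyStyle{N}()
Broadcast.BroadcastStyle(::Type{<:ToyArray{T,N}}) where {T,N} = ToyStyle{N}()

mat = ToyArray(ones(Float32, 2, 2))
w   = reshape(transpose(mat), 2, 1, 2)   # ReshapedArray of Transpose of ToyArray
zed = ToyArray(zeros(Float32, 1))        # a bare ToyArray

# Only doubly-wrapped arguments: the custom style is not seen.
style_wrapped = Broadcast.combine_styles(w, w)

# One bare ToyArray participating: the custom style wins over the default one.
style_mixed = Broadcast.combine_styles(zed, w)

println(style_wrapped)
println(style_mixed)
```

With only the wrapped arguments, the combined style is `DefaultArrayStyle{3}`; once `zed` participates, it becomes `ToyStyle{3}`.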
Unwrapping would be materializing the wrapper using … Anyway, you've demonstrated a workaround yourself: make sure there's an actual CuArray (or a singly-wrapped one) participating in the broadcast. The resulting broadcast style will then be the GPU one. When you do … An alternative workaround is to override the broadcast style:
julia> C = cu(ones(10,2));
julia> L = cu(ones(10,3));
julia> Meta.@lower reshape(C',1,2,10) .+ reshape(L', 3,1,10)
:($(Expr(:thunk, CodeInfo(
@ none within `top-level scope`
1 ─ %1 = var"'"(C)
│ %2 = reshape(%1, 1, 2, 10)
│ %3 = var"'"(L)
│ %4 = reshape(%3, 3, 1, 10)
│ %5 = Base.broadcasted(+, %2, %4)
│ %6 = Base.materialize(%5)
└── return %6
julia> a = reshape(C',1,2,10);
julia> b = reshape(L', 3,1,10);
julia> bc = Base.broadcasted(+, a, b);
julia> CUDA.allowscalar(false)
julia> Broadcast.materialize(bc)
ERROR: scalar getindex is disallowed
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] assertscalar(::String) at /home/tim/Julia/pkg/GPUArrays/src/host/indexing.jl:41
[3] getindex at /home/tim/Julia/pkg/GPUArrays/src/host/indexing.jl:96 [inlined]
[4] _getindex at ./abstractarray.jl:1066 [inlined]
[5] getindex at ./abstractarray.jl:1043 [inlined]
[6] getindex at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.5/LinearAlgebra/src/adjtrans.jl:190 [inlined]
[7] _unsafe_getindex_rs at ./reshapedarray.jl:249 [inlined]
[8] _unsafe_getindex at ./reshapedarray.jl:246 [inlined]
[9] getindex at ./reshapedarray.jl:234 [inlined]
[10] _getindex at ./abstractarray.jl:1083 [inlined]
[11] getindex at ./abstractarray.jl:1043 [inlined]
[12] _broadcast_getindex at ./broadcast.jl:614 [inlined]
[13] _getindex at ./broadcast.jl:644 [inlined]
[14] _broadcast_getindex at ./broadcast.jl:620 [inlined]
[15] getindex at ./broadcast.jl:575 [inlined]
[16] macro expansion at ./broadcast.jl:932 [inlined]
[17] macro expansion at ./simdloop.jl:77 [inlined]
[18] copyto! at ./broadcast.jl:931 [inlined]
[19] copyto! at ./broadcast.jl:886 [inlined]
[20] copy at ./broadcast.jl:862 [inlined]
[21] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.DefaultArrayStyle{3},Nothing,typeof(+),Tuple{Base.ReshapedArray{Float32,3,LinearAlgebra.Adjoint{Float32,CuArray{Float32,2,Nothing}},Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}},Base.ReshapedArray{Float32,3,LinearAlgebra.Adjoint{Float32,CuArray{Float32,2,Nothing}},Tuple{Base.MultiplicativeInverses.SignedMultiplicativeInverse{Int64}}}}}) at ./broadcast.jl:837
[22] top-level scope at REPL[51]:1
julia> bc = Base.broadcasted(CUDA.CuArrayStyle{2}(), +, a, b);
julia> Broadcast.materialize(bc);
Maybe with some tooling this could be made practical for you (…
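As a rough idea of what such tooling might look like, the sketch below wraps the two calls from the transcript in one function. The name `broadcast_with_style` is hypothetical and not part of CUDA.jl or Base. So that it runs without a GPU, the usage line passes `Broadcast.DefaultArrayStyle{2}()`; with CUDA.jl one would presumably pass a `CUDA.CuArrayStyle` instance, as in the transcript above.

```julia
# Hypothetical helper: build the lazy broadcast with an explicitly chosen style,
# then materialize it, instead of letting the style be inferred from the arguments.
function broadcast_with_style(style::Broadcast.BroadcastStyle, f, args...)
    bc = Broadcast.broadcasted(style, f, args...)
    return Broadcast.materialize(bc)
end

# CPU-only usage with the default array style (an assumption made for runnability).
result = broadcast_with_style(Broadcast.DefaultArrayStyle{2}(), +, [1 2; 3 4], [10, 20])
println(result)   # [11 12; 23 24]
```

The helper does not check that the chosen style suits the arguments, so the caller remains responsible for picking one.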
Thanks for having a look, haven't got back to this. Re tooling, I was wondering whether … But I also wonder why broadcasting can't see this itself. It is easy to recursively call …
That's an interesting thought! I'm wary of overloading …
OK, right, I don't know if … For the original issue here, mcabbott/TensorCast.jl#25, the ideal logic would be something that digs through wrappers and returns an Array or a CuArray depending on what it sees. But the other use is broadcasting over things like ranges or CartesianIndices, in which case it cannot guess; you will have to specify if you want a CuArray.
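A minimal sketch of the "dig through wrappers" logic, assuming every wrapper involved implements `Base.parent` (true for `Transpose`, `Adjoint`, `ReshapedArray` and `SubArray`, but not guaranteed for arbitrary wrapper types). The name `storage_type` is hypothetical. It runs on plain `Array`s here; the same recursion would be expected to land on `CuArray` for GPU data.

```julia
using LinearAlgebra

# Hypothetical helper: follow `parent` until it returns the array itself,
# then report the type of the innermost array.
function storage_type(x::AbstractArray)
    p = parent(x)
    return p === x ? typeof(x) : storage_type(p)
end

doubly_wrapped = reshape(transpose(ones(Float32, 2, 2)), 2, 1, 2)
println(storage_type(doubly_wrapped))   # Matrix{Float32}, i.e. Array{Float32, 2}

# Ranges have no wrapped storage to discover, which is the case where
# the caller would have to say explicitly which array type is wanted.
println(storage_type(1:3))
```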
Some broadcasting operations work fine on reshape(transpose(cu(… objects, but some fail:
Something like this appears to be the cause of mcabbott/TensorCast.jl#25, where the failure happens only when broadcasting two such objects; just one is fine:
This is with CUDA v0.1.0. I get the same errors on CuArrays v2.2.1, and on CuArrays v1.7.2 the only change is this: