Dear Sir:
Another question: how should I handle boundary conditions in OP2-CUDA? Branch divergence always shows up in CUDA code when there are many if-conditions, especially in hydrodynamic simulations.
Thanks
Li Jian
Hello,
There is no ideal way of doing this. You can either include if conditions in a kernel that also covers the boundary, checking against the index passed in by op_arg_idx, or you can launch separate ops_par_loops for the boundary (see e.g. update_halo.cpp in apps/c/CloverLeaf). Which one performs better depends very much on your application. But if you want to switch easily between different boundary conditions, I suggest going with separate ops_par_loops (even though they might end up being slightly slower).
Best,
Istvan
Let me correct that (I mixed up OPS and OP2 here). For OP2, you can create sets which only include the boundary elements, and then do an op_par_loop only over those. Or you can create a dataset which flags which elements are on the boundary, and do the if conditions inside the kernel for an op_par_loop over the entire domain.
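To make the two options concrete, here is a minimal sketch in plain C++. It deliberately does not use the OP2 API: the function names, the wall value and the "halve the interior" update are all hypothetical, chosen only to illustrate the trade-off. The first function mimics a loop over the entire domain with a flag dataset and a branch per element; the second mimics a branch-free pass followed by a second loop over a boundary-only set.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical boundary treatment: boundary elements are pinned to this value.
constexpr double kWallValue = 0.0;

// Option 1: one loop over the whole domain, branching on a per-element flag.
// On a GPU, neighbouring threads that disagree on the flag would diverge.
std::vector<double> update_with_flags(const std::vector<double>& h,
                                      const std::vector<int>& is_boundary) {
    assert(h.size() == is_boundary.size());
    std::vector<double> out(h.size());
    for (std::size_t i = 0; i < h.size(); ++i) {
        out[i] = is_boundary[i] ? kWallValue : 0.5 * h[i];
    }
    return out;
}

// Option 2: a branch-free pass over every element, then a second loop that
// visits only the elements listed in a separate boundary set.
std::vector<double> update_with_subset(const std::vector<double>& h,
                                       const std::vector<std::size_t>& boundary_ids) {
    std::vector<double> out(h.size());
    for (std::size_t i = 0; i < h.size(); ++i) {
        out[i] = 0.5 * h[i];  // hypothetical interior update, no branch
    }
    for (std::size_t id : boundary_ids) {
        assert(id < h.size());
        out[id] = kWallValue;  // boundary elements are overwritten afterwards
    }
    return out;
}
```

Both functions produce the same result for matching inputs. The second costs an extra pass but keeps the main loop free of branches, and switching boundary conditions only means changing the small boundary loop.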