-
Notifications
You must be signed in to change notification settings - Fork 376
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
L3 oapi without macros #531
Conversation
… call gemmt instead. The induced methods for gemmt are currently missing but I imagine that is easy to fill in.
@fgvanzee I think this is actually ready to merge now. All of the L3 operations have the full interface (_ex, nat, ind, etc.) and all of the tests past. |
I found a regression. Previously, calling There are other examples, such as enforcing real-domain |
If I reinstate the |
|
I'm also seeing that you made a pretty big reorganization of the files that do the expert vs. non-expert API. I think I'm going to need quite a bit more time to review this. |
I think that would be okay, yes. Was there some other alternative that you had in mind? |
No, I just didn't know if this had to happen in |
I have confirmed that the only thing that happens between |
NOTE: previously these checks happened in the _front layer, where cntx is guaranteed to be valid. Now the checks happen earlier when cntx may be NULL. This causes a segfault in bli_check_sufficient_stack_buf_size. Because the stack size will be checked again later, we now return success in the stack size check when cntx == NULL.
@fgvanzee I added back the error checking. |
# Conflicts: # frame/3/bli_l3.h # frame/3/bli_l3_ind.c # frame/3/bli_l3_oapi.c # frame/3/bli_l3_oapi.h # frame/3/bli_l3_oapi_ex.c # frame/3/bli_l3_tapi_ex.h # frame/3/gemmt/bli_gemmt_front.c # frame/3/her2k/bli_her2k_front.c # frame/3/herk/bli_herk_front.c # frame/3/syr2k/bli_syr2k_front.c # frame/3/syrk/bli_syrk_front.c # frame/ind/oapi/bli_l3_3m4m1m_oapi.c # frame/ind/oapi/bli_l3_ind_oapi.c # frame/ind/oapi/bli_l3_ind_oapi.h # frame/ind/oapi/bli_l3_nat_oapi.c # frame/ind/tapi/bli_l3_ind_tapi.c # kernels/armsve/3/armsve_asm_macros_scomplex.h
@fgvanzee should be ready to merge. |
Thanks, I'll start reviewing it. |
Details: - Added #include "bli_l3_schema.h" back to bli_l3.h, which got removed (probably on accident). - Removed "if ( !cntx )" early return statement from bli_check_sufficient_stack_buf_size() in bli_check.c. - Changed "n" parameter to syrk back to "k" in bli_l3_oapi_ex.h. - Converted /**/-style comments in bli_l3_oapi_ex.c to //-style comments. - Whitespace updates.
Details: - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change.
@devinamatthews Turns out we did need the BUT, I decided to solve it by moving the |
I'll put together a squash-merge commit log and then squash-merge later today. |
Details: - Kept two older herk macrokernels in frame/3/herk/other, renaming the files and updating the definitions according to the rename of herk to gemmt. The files are now stored in frame/3/gemmt/other.
Details: - Renamed herk macrokernels and supporting files and functions to gemmt, which is possible since at the macrokernel level they are identical. Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal functions rather than cpp macros that instantiate multiple functions. Thanks to Devin Matthews for his efforts on this issue (flame#531). - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change.
Details: - Renamed herk macrokernels and supporting files and functions to gemmt, which is possible since at the macrokernel level they are identical. Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal functions rather than cpp macros that instantiate multiple functions. Thanks to Devin Matthews for his efforts on this issue (#531). - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change. - (cherry picked from commit 28b0982)
Simplify the l3 oapi by removing the rather unnecessary macro layer. This is an example of how many parts of BLIS which are currently macros, but don't really need to be could be potentially improved (I guess it depends on your viewpoint though 😄).