L3 oapi without macros #531

devinamatthews · 2021-08-15T03:19:04Z

Simplify the l3 oapi by removing the rather unnecessary macro layer. This is an example of how many parts of BLIS which are currently macros, but don't really need to be could be potentially improved (I guess it depends on your viewpoint though 😄).

… call gemmt instead. The induced methods for gemmt are currently missing but I imagine that is easy to fill in.

devinamatthews · 2021-09-10T17:35:58Z

@fgvanzee I think this is actually ready to merge now. All of the L3 operations have the full interface (_ex, nat, ind, etc.) and all of the tests past.

fgvanzee · 2021-09-10T19:56:11Z

I found a regression. Previously, calling herk (for example) with a C that is tagged BLIS_GENERAL (instead of BLIS_HERMITIAN) would have triggered a runtime error. It seems that this code no longer checks for the expected struc_t value. Was this intentional?

There are other examples, such as enforcing real-domain beta for herk and her2k, and real-domain alpha for herk.

devinamatthews · 2021-09-10T20:21:46Z

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

devinamatthews · 2021-09-10T20:24:14Z

~~Shouldn't gemmt be doing the check on C?~~ Duh of course not.

fgvanzee · 2021-09-10T20:41:27Z

I'm also seeing that you made a pretty big reorganization of the files that do the expert vs. non-expert API.

I think I'm going to need quite a bit more time to review this.

fgvanzee · 2021-09-10T20:42:23Z

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

devinamatthews · 2021-09-10T20:46:54Z

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

No, I just didn't know if this had to happen in _front specifically. I will add the check functions back in. BTW none of this is actually tested in the testsuite :).

fgvanzee · 2021-09-11T00:00:38Z

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

No, I just didn't know if this had to happen in _front specifically. I will add the check functions back in. BTW none of this is actually tested in the testsuite :).

I have confirmed that the only thing that happens between bli_l3_oapi.c and the calling of the _front() functions is getting the context and runtime structs squared away.

NOTE: previously these checks happened in the _front layer, where cntx is guaranteed to be valid. Now the checks happen earlier when cntx may be NULL. This causes a segfault in bli_check_sufficient_stack_buf_size. Because the stack size will be checked again later, we now return success in the stack size check when cntx == NULL.

devinamatthews · 2021-09-13T16:56:09Z

@fgvanzee I added back the error checking.

# Conflicts: # frame/3/bli_l3.h # frame/3/bli_l3_ind.c # frame/3/bli_l3_oapi.c # frame/3/bli_l3_oapi.h # frame/3/bli_l3_oapi_ex.c # frame/3/bli_l3_tapi_ex.h # frame/3/gemmt/bli_gemmt_front.c # frame/3/her2k/bli_her2k_front.c # frame/3/herk/bli_herk_front.c # frame/3/syr2k/bli_syr2k_front.c # frame/3/syrk/bli_syrk_front.c # frame/ind/oapi/bli_l3_3m4m1m_oapi.c # frame/ind/oapi/bli_l3_ind_oapi.c # frame/ind/oapi/bli_l3_ind_oapi.h # frame/ind/oapi/bli_l3_nat_oapi.c # frame/ind/tapi/bli_l3_ind_tapi.c # kernels/armsve/3/armsve_asm_macros_scomplex.h

devinamatthews · 2021-11-08T18:47:16Z

@fgvanzee should be ready to merge.

fgvanzee · 2021-11-08T23:08:57Z

Thanks, I'll start reviewing it.

Details: - Added #include "bli_l3_schema.h" back to bli_l3.h, which got removed (probably on accident). - Removed "if ( !cntx )" early return statement from bli_check_sufficient_stack_buf_size() in bli_check.c. - Changed "n" parameter to syrk back to "k" in bli_l3_oapi_ex.h. - Converted /**/-style comments in bli_l3_oapi_ex.c to //-style comments. - Whitespace updates.

Details: - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change.

fgvanzee · 2021-11-10T14:43:27Z

@devinamatthews Turns out we did need the if ( !cntx ) conditional after all, because operations like herk call their _check() function before the cntx_t* is valid (which doesn't happen until gemmt is called).

BUT, I decided to solve it by moving the bli_check_sufficient_stack_buf_size() call into the gks, since that was where we agreed we would go eventually anyway.

fgvanzee · 2021-11-10T14:45:34Z

I'll put together a squash-merge commit log and then squash-merge later today.

Details: - Kept two older herk macrokernels in frame/3/herk/other, renaming the files and updating the definitions according to the rename of herk to gemmt. The files are now stored in frame/3/gemmt/other.

Details: - Renamed herk macrokernels and supporting files and functions to gemmt, which is possible since at the macrokernel level they are identical. Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal functions rather than cpp macros that instantiate multiple functions. Thanks to Devin Matthews for his efforts on this issue (flame#531). - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change.

Details: - Renamed herk macrokernels and supporting files and functions to gemmt, which is possible since at the macrokernel level they are identical. Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal functions rather than cpp macros that instantiate multiple functions. Thanks to Devin Matthews for his efforts on this issue (#531). - Check that the maximum stack buffer size is sufficiently large relative to the register blocksizes for each datatype, and do so when the context is initialized rather than when an operation is called. Note that with this change, users who pass in their own contexts into the expert interfaces currently will *not* have any checks performed. Thanks to Devin Matthews for suggesting this change. - (cherry picked from commit 28b0982)

Simplify the l3 oapi by removing the rather unnecessary macro layer.

204e52f

devinamatthews requested a review from fgvanzee August 15, 2021 03:19

devinamatthews added 2 commits August 15, 2021 13:25

Completely cull the framework code for syrk/herk/syr2k/her2k and just…

905b266

… call gemmt instead. The induced methods for gemmt are currently missing but I imagine that is easy to fill in.

Fix induced method handling for gemmt/syr2k/her2k/syrk/herk.

f57edcc

devinamatthews marked this pull request as ready for review September 10, 2021 17:35

Re-add bli_her2kind etc.

e515ee0

devinamatthews mentioned this pull request Oct 5, 2021

Enable user-customization of packing #549

Merged

devinamatthews added 2 commits November 2, 2021 15:07

Merge branch 'master' into l3_oapi_simplification

88ec414

fgvanzee added 2 commits November 9, 2021 17:20

Restored older herk/gemmt macrokernels in 'other'.

bb133d1

Details: - Kept two older herk macrokernels in frame/3/herk/other, renaming the files and updating the definitions according to the rename of herk to gemmt. The files are now stored in frame/3/gemmt/other.

fgvanzee merged commit 28b0982 into master Nov 10, 2021

devinamatthews deleted the l3_oapi_simplification branch November 12, 2021 15:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

L3 oapi without macros #531

L3 oapi without macros #531

devinamatthews commented Aug 15, 2021

devinamatthews commented Sep 10, 2021

fgvanzee commented Sep 10, 2021 •

edited

Loading

devinamatthews commented Sep 10, 2021

devinamatthews commented Sep 10, 2021 •

edited

Loading

fgvanzee commented Sep 10, 2021

fgvanzee commented Sep 10, 2021

devinamatthews commented Sep 10, 2021

fgvanzee commented Sep 11, 2021

devinamatthews commented Sep 13, 2021

devinamatthews commented Nov 8, 2021

fgvanzee commented Nov 8, 2021

fgvanzee commented Nov 10, 2021

fgvanzee commented Nov 10, 2021

L3 oapi without macros #531

L3 oapi without macros #531

Conversation

devinamatthews commented Aug 15, 2021

devinamatthews commented Sep 10, 2021

fgvanzee commented Sep 10, 2021 • edited Loading

devinamatthews commented Sep 10, 2021

devinamatthews commented Sep 10, 2021 • edited Loading

fgvanzee commented Sep 10, 2021

fgvanzee commented Sep 10, 2021

devinamatthews commented Sep 10, 2021

fgvanzee commented Sep 11, 2021

devinamatthews commented Sep 13, 2021

devinamatthews commented Nov 8, 2021

fgvanzee commented Nov 8, 2021

fgvanzee commented Nov 10, 2021

fgvanzee commented Nov 10, 2021

fgvanzee commented Sep 10, 2021 •

edited

Loading

devinamatthews commented Sep 10, 2021 •

edited

Loading