Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

L3 oapi without macros #531

Merged
merged 10 commits into from
Nov 10, 2021
Merged

L3 oapi without macros #531

merged 10 commits into from
Nov 10, 2021

Conversation

devinamatthews
Copy link
Member

Simplify the l3 oapi by removing the rather unnecessary macro layer. This is an example of how many parts of BLIS which are currently macros, but don't really need to be could be potentially improved (I guess it depends on your viewpoint though 😄).

… call gemmt instead.

The induced methods for gemmt are currently missing but I imagine that is easy to fill in.
@devinamatthews devinamatthews marked this pull request as ready for review September 10, 2021 17:35
@devinamatthews
Copy link
Member Author

@fgvanzee I think this is actually ready to merge now. All of the L3 operations have the full interface (_ex, nat, ind, etc.) and all of the tests past.

@fgvanzee
Copy link
Member

fgvanzee commented Sep 10, 2021

I found a regression. Previously, calling herk (for example) with a C that is tagged BLIS_GENERAL (instead of BLIS_HERMITIAN) would have triggered a runtime error. It seems that this code no longer checks for the expected struc_t value. Was this intentional?

There are other examples, such as enforcing real-domain beta for herk and her2k, and real-domain alpha for herk.

@devinamatthews
Copy link
Member Author

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

@devinamatthews
Copy link
Member Author

devinamatthews commented Sep 10, 2021

Shouldn't gemmt be doing the check on C? Duh of course not.

@fgvanzee
Copy link
Member

I'm also seeing that you made a pretty big reorganization of the files that do the expert vs. non-expert API.

I think I'm going to need quite a bit more time to review this.

@fgvanzee
Copy link
Member

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

@devinamatthews
Copy link
Member Author

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

No, I just didn't know if this had to happen in _front specifically. I will add the check functions back in. BTW none of this is actually tested in the testsuite :).

@fgvanzee
Copy link
Member

If I reinstate the _check routines, is calling them from the oapi layer sufficient?

I think that would be okay, yes. Was there some other alternative that you had in mind?

No, I just didn't know if this had to happen in _front specifically. I will add the check functions back in. BTW none of this is actually tested in the testsuite :).

I have confirmed that the only thing that happens between bli_l3_oapi.c and the calling of the _front() functions is getting the context and runtime structs squared away.

NOTE: previously these checks happened in the _front layer, where cntx is guaranteed to be valid. Now the checks happen earlier when cntx may be NULL. This causes a segfault in bli_check_sufficient_stack_buf_size. Because the stack size will be checked again later, we now return success in the stack size check when cntx == NULL.
@devinamatthews
Copy link
Member Author

@fgvanzee I added back the error checking.

# Conflicts:
#	frame/3/bli_l3.h
#	frame/3/bli_l3_ind.c
#	frame/3/bli_l3_oapi.c
#	frame/3/bli_l3_oapi.h
#	frame/3/bli_l3_oapi_ex.c
#	frame/3/bli_l3_tapi_ex.h
#	frame/3/gemmt/bli_gemmt_front.c
#	frame/3/her2k/bli_her2k_front.c
#	frame/3/herk/bli_herk_front.c
#	frame/3/syr2k/bli_syr2k_front.c
#	frame/3/syrk/bli_syrk_front.c
#	frame/ind/oapi/bli_l3_3m4m1m_oapi.c
#	frame/ind/oapi/bli_l3_ind_oapi.c
#	frame/ind/oapi/bli_l3_ind_oapi.h
#	frame/ind/oapi/bli_l3_nat_oapi.c
#	frame/ind/tapi/bli_l3_ind_tapi.c
#	kernels/armsve/3/armsve_asm_macros_scomplex.h
@devinamatthews
Copy link
Member Author

@fgvanzee should be ready to merge.

@fgvanzee
Copy link
Member

fgvanzee commented Nov 8, 2021

Thanks, I'll start reviewing it.

Details:
- Added #include "bli_l3_schema.h" back to bli_l3.h, which got removed
  (probably on accident).
- Removed "if ( !cntx )" early return statement from
  bli_check_sufficient_stack_buf_size() in bli_check.c.
- Changed "n" parameter to syrk back to "k" in bli_l3_oapi_ex.h.
- Converted /**/-style comments in bli_l3_oapi_ex.c to //-style
  comments.
- Whitespace updates.
Details:
- Check that the maximum stack buffer size is sufficiently large
  relative to the register blocksizes for each datatype, and do so when
  the context is initialized rather than when an operation is called.
  Note that with this change, users who pass in their own contexts into
  the expert interfaces currently will *not* have any checks performed.
  Thanks to Devin Matthews for suggesting this change.
@fgvanzee
Copy link
Member

@devinamatthews Turns out we did need the if ( !cntx ) conditional after all, because operations like herk call their _check() function before the cntx_t* is valid (which doesn't happen until gemmt is called).

BUT, I decided to solve it by moving the bli_check_sufficient_stack_buf_size() call into the gks, since that was where we agreed we would go eventually anyway.

@fgvanzee
Copy link
Member

I'll put together a squash-merge commit log and then squash-merge later today.

Details:
- Kept two older herk macrokernels in frame/3/herk/other, renaming the
  files and updating the definitions according to the rename of herk
  to gemmt. The files are now stored in frame/3/gemmt/other.
@fgvanzee fgvanzee merged commit 28b0982 into master Nov 10, 2021
@devinamatthews devinamatthews deleted the l3_oapi_simplification branch November 12, 2021 15:41
dzambare pushed a commit to Meghana-vankadari/blis that referenced this pull request Jan 6, 2022
Details:
- Renamed herk macrokernels and supporting files and functions to gemmt, 
  which is possible since at the macrokernel level they are identical. 
  Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert
  level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal
  functions rather than cpp macros that instantiate multiple functions.
  Thanks to Devin Matthews for his efforts on this issue (flame#531).
- Check that the maximum stack buffer size is sufficiently large
  relative to the register blocksizes for each datatype, and do so when
  the context is initialized rather than when an operation is called.
  Note that with this change, users who pass in their own contexts into
  the expert interfaces currently will *not* have any checks performed.
  Thanks to Devin Matthews for suggesting this change.
fgvanzee added a commit that referenced this pull request Sep 10, 2022
Details:
- Renamed herk macrokernels and supporting files and functions to gemmt,
  which is possible since at the macrokernel level they are identical.
  Then recast herk/her2k/syrk/syr2k in terms of gemmt within the expert
  level-3 oapi (bli_l3_oapi_ex.c) while also redefining them as literal
  functions rather than cpp macros that instantiate multiple functions.
  Thanks to Devin Matthews for his efforts on this issue (#531).
- Check that the maximum stack buffer size is sufficiently large
  relative to the register blocksizes for each datatype, and do so when
  the context is initialized rather than when an operation is called.
  Note that with this change, users who pass in their own contexts into
  the expert interfaces currently will *not* have any checks performed.
  Thanks to Devin Matthews for suggesting this change.
- (cherry picked from commit 28b0982)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants