Skip to content

Commit

Permalink
Make error checking level a thread-local variable.
Browse files Browse the repository at this point in the history
Previously, this was a global variable. Setting the value was synchronized via a mutex but reading the value was not. Of course, these accesses are almost certainly atomic, but there is still the possibility of one thread attempting to set the value and then reading the value set by another thread. For correct operation under user threading (e.g. pthreads), this should probably be thread-local with no mutex.
  • Loading branch information
devinamatthews committed Oct 2, 2021
1 parent c4fadf2 commit c493032
Showing 1 changed file with 3 additions and 15 deletions.
18 changes: 3 additions & 15 deletions frame/base/bli_error.c
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,7 @@
*/

#include "bli_lang_defs.h"
#include "blis.h"

// Internal array to hold error strings.
Expand Down Expand Up @@ -133,11 +134,8 @@ void bli_abort( void )

// -----------------------------------------------------------------------------

// A mutex to allow synchronous access to bli_err_chk_level.
static bli_pthread_mutex_t err_mutex = BLIS_PTHREAD_MUTEX_INITIALIZER;

// Current error checking level.
static errlev_t bli_err_chk_level = BLIS_FULL_ERROR_CHECKING;
static BLIS_THREAD_LOCAL errlev_t bli_err_chk_level = BLIS_FULL_ERROR_CHECKING;

This comment has been minimized.

Copy link
@hominhquan

hominhquan Oct 3, 2021

Contributor

Oh, I did not notice that BLIS now uses TLS. Hmm, what to say, without wanting to be rude, I strongly discourage using TLS in such a critical, high-valued library like BLIS, for following reasons:

  • TLS is, ofcourse supported on many systems/compilers, there may still some (embedded) platforms/compilers/OS'es which does not (or can not) support it natively/efficiently.

  • On the parallel-programming side, TLS is just a lazy way to "de-globalize" global variables which were spreading overtime and are causing problems on highly-parallel context nowadays. The correct approach, IMHO, is to design contextless library (library which acts accordingly to the arguments in API call), not contextful (library which acts based on arguments and some internal global variables)

  • TLS is ... special, it is not global nor local nor stack variables. TLS variables are visually written and used as global variables, but thought as (but not really) local/stack ones. Data-racing while using TLS is not impossible either. Combining with the fact that TLS may have different/weird behavior on different architectures, that will not arrange things.

  • Last question: Does the error-checking level need to be per-thread ? My intuition is "No". If it is something only on the control-path and not impacting performance or correctness, so it should be kept simple and safe as much as possible.

Those are my points of view, they might not apply on everywhere. We can open a question on Discord to gather feedback ?

This comment has been minimized.

Copy link
@devinamatthews

devinamatthews Oct 3, 2021

Author Member

One can debate the merits of making any particular setting thread-local, but I consider the rabbit hole already gone down due to bli_set_num_threads(). This function is a) clearly the way that users expect to interact with BLIS thread settings (and the only way they can when using the BLAS/CBLAS apis, and b) clearly in need of TLS. For embedded systems BLIS already #defines BLIS_THREAD_LOCAL to nothing, so there is not an issue there.

This comment has been minimized.

Copy link
@devinamatthews

devinamatthews Oct 3, 2021

Author Member

On the other hand, I think it would be possible to (ab)use the BLIS interface (not BLAS/CBLAS) to support either stateful or stateless execution by expanding the role of the rntm_t and/or cntx_t structures.


errlev_t bli_error_checking_level( void )
{
Expand All @@ -151,17 +149,7 @@ void bli_error_checking_level_set( errlev_t new_level )
e_val = bli_check_valid_error_level( new_level );
bli_check_error_code( e_val );

// Acquire the mutex protecting bli_err_chk_level.
bli_pthread_mutex_lock( &err_mutex );

// BEGIN CRITICAL SECTION
{
bli_err_chk_level = new_level;
}
// END CRITICAL SECTION

// Release the mutex protecting bli_err_chk_level.
bli_pthread_mutex_unlock( &err_mutex );
bli_err_chk_level = new_level;
}

bool bli_error_checking_is_enabled( void )
Expand Down

0 comments on commit c493032

Please sign in to comment.