lsan race with thread initialization and fast_unwind_on_malloc=0 #1836

peff · 2025-01-02T02:59:12Z

In the Git project we've seen some false positives running our test suite with LSan. They seem to be caused by racing between a thread being created and another one exiting the process (and thus triggering a leak check). Here's a simplified reproduction program:

#include <stdlib.h>
#include <pthread.h>

static void *run(void *data)
{
        return NULL;
}

int main(void)
{
        pthread_t threads[256];

        for (int i = 0; i < sizeof(threads)/sizeof(*threads); i++)
                pthread_create(&threads[i], NULL, run, NULL);
        exit(0);
}

If I compile with LSan (this is using gcc 14.2.0, but I get the same result with clang 19.1.6):

gcc -fsanitize=leak foo.c

and then run it a few times (the race usually triggers within a handful of runs), with fast_unwind_on_malloc=0:

while LSAN_OPTIONS=fast_unwind_on_malloc=0 ./a.out
do
  : nothing
done

then I get this leak reported:

=================================================================
==3861584==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x7f52d6c14556 in realloc ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:98
    #1 0x7f52d6a9d2c1 in __pthread_getattr_np nptl/pthread_getattr_np.c:180
    #2 0x7f52d6c2500d in __sanitizer::GetThreadStackTopAndBottom(bool, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:150
    #3 0x7f52d6c25187 in __sanitizer::GetThreadStackAndTls(bool, unsigned long*, unsigned long*, unsigned long*, unsigned long*) ../../../../src/libsanitizer/sanitizer_common/sanitizer_linux_libcdep.cpp:614
    #4 0x7f52d6c17d18 in __lsan::ThreadStart(unsigned int, unsigned long long, __sanitizer::ThreadType) ../../../../src/libsanitizer/lsan/lsan_posix.cpp:53
    #5 0x7f52d6c143a9 in ThreadStartFunc<false> ../../../../src/libsanitizer/lsan/lsan_interceptors.cpp:431
    #6 0x7f52d6a9bf51 in start_thread nptl/pthread_create.c:447
    #7 0x7f52d6b1a677 in __clone3 ../sysdeps/unix/sysv/linux/x86_64/clone3.S:78

SUMMARY: LeakSanitizer: 32 byte(s) leaked in 1 allocation(s).

It looks like setting up the stack requires pthread_getattr_np, which allocates. Presumably the allocation is properly cleaned up, but if the original thread calls exit at the wrong time, we never get a chance to do so. And since we haven't yet set up the thread's stack, it's not considered reachable.

Curiously, without fast_unwind_on_malloc=0, I haven't been able to trigger the problem. I don't know if that's because it is just less likely to trigger the race somehow, or if it uses a different code path for setting up the thread stack.

Allocating in pthread_getattr_np is going to be system dependent. In this case it's a Debian Linux/glibc system.

The text was updated successfully, but these errors were encountered:

peff · 2025-01-02T03:20:29Z

You can avoid the race if you use a barrier to stop all threads until they are all finished initializing, like:

#include <stdlib.h>
#include <pthread.h>

static pthread_barrier_t barrier;

static void *run(void *data)
{
        pthread_barrier_wait(&barrier);
        return NULL;
}

int main(void)
{
        pthread_t threads[256];

        pthread_barrier_init(&barrier, NULL, sizeof(threads)/sizeof(*threads)+1);
        for (int i = 0; i < sizeof(threads)/sizeof(*threads); i++)
                pthread_create(&threads[i], NULL, run, NULL);
        pthread_barrier_wait(&barrier);
        exit(0);
}

That's effectively stopping the world during initialization so that nobody can call exit. But it's not a very flexible solution.

My guess (as a complete layman who is not at all familiar with the sanitizer code) is that the pthread_create interceptor could probably take a lock/mutex while setting up the thread, and likewise the atexit sanitizer run would wait on the same lock.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

lsan race with thread initialization and fast_unwind_on_malloc=0 #1836

lsan race with thread initialization and fast_unwind_on_malloc=0 #1836

peff commented Jan 2, 2025

peff commented Jan 2, 2025

lsan race with thread initialization and fast_unwind_on_malloc=0 #1836

lsan race with thread initialization and fast_unwind_on_malloc=0 #1836

Comments

peff commented Jan 2, 2025

peff commented Jan 2, 2025