Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide runtime builds with sanitizer support #4819

Open
JohanEngelen opened this issue Jan 12, 2025 · 9 comments
Open

Provide runtime builds with sanitizer support #4819

JohanEngelen opened this issue Jan 12, 2025 · 9 comments
Labels
sanitizers ASan, libFuzzer, ...

Comments

@JohanEngelen
Copy link
Member

JohanEngelen commented Jan 12, 2025

To use sanitizers like AddressSanitizer, you need druntime support if the user's program uses fibers or if ASan's fakestack is enabled (Use after return error detection enabled).

There is some overhead associated with sanitizer support in druntime, which is why it is disabled by default. The overhead involves memory lookup (cache impacts) for every GC scan and fiber switch. I have not quantified this overhead. Perhaps it is small because a GC scan is already a huge operation with a large cache load. Moreover, CheckFiberMigration is set to true when sanitizer support is enabled; I think the consensus is that migrating fibers is bad, but it is allowed in druntime on e.g. Linux.

To get druntime support, you have to build druntime with -d-version=SupportSanitizers. Building LDC with cmake variable RT_SUPPORT_SANITIZERS set will do just that.
It is quite cumbersome to have to rebuild the druntime (and link with it) to get sanitizer support. We currently don't have LDC packages that have sanitizer support built into druntime, making it much harder to do CI testing with sanitizers enabled (CI would have to rebuild druntime).
I want to make that much simpler, hopefully so simple that just -fsanitize=address is enough (automatically checks whether druntime with sanitizer support is available and links with that).

I see a few options to get there, and need your help to decide which direction to go.

Option A: default build druntime with sanitizer support.

  • Pro: easy to do, no extra files introduced. ASan possible with optimized programs.
  • Con: all programs will now have extra runtime overhead.

Option B: provide separate druntime libraries with sanitizer support
This is similar to how we have optimized and debug druntime library builds. This option means adding 4 files: libdruntime-ldc-supportsan.a, libdruntime-ldc-debug-supportsan.a, libdruntime-ldc-debug-supportsan-shared.dylib, libdruntime-ldc-supportsan-shared.dylib. We could make LDC automatically check for a supportsan library when -fsanitize=address is used.

  • Pro: no impact on existing programs. ASan possible with optimized programs.
  • Con: extra files introduced, LDC package becomes larger

Option C: Support only for debug builds, default enabled in debug druntime lib.

  • Pro: overhead acceptable for debug builds, checkfibermigration acceptable? Easy like option A.
  • Con: not enabled for optimized builds means that bugs may be missed (because checks are optimized out), i.e. ASan not possible with optimized programs.

(Edit) Option D: Support sanitizers in debug build (default on), and separate optimized druntime library with optimization on.
This option means adding 2 files: libdruntime-ldc-supportsan.a and libdruntime-ldc-supportsan-shared.dylib.

  • Pro: no impact on optimized programs. Debug builds overhead acceptable, checkfibermigration acceptable? ASan possible with optimized programs.
  • Con: extra files, LDC package becomes larger (but less than option B)

Other options?

Note: for C++/clang there is no problem (no separate runtime binary needed besides the ASan library that overrides several std C lib functions), because the relevant parts of the standard/runtime library are templated and hence always recompiled with/without sanitizer support (e.g. for container annotation, https://github.com/llvm/llvm-project/blob/d080f78772acf9de4961b89062c02fdd5f82186a/libcxx/include/__debug_utils/sanitizers.h#L83-L100).

@JohanEngelen JohanEngelen added the sanitizers ASan, libFuzzer, ... label Jan 12, 2025
@JohanEngelen
Copy link
Member Author

@kinke @ljmf00 what do you think?

@kinke
Copy link
Member

kinke commented Jan 12, 2025

I'd prefer not to add further library sets if possible. Option C is doable easily and doesn't have any downsides (for existing -link-defaultlib-debug usages) I think. Option A if the (quantized) overhead is negligible.

@JohanEngelen
Copy link
Member Author

The main downside of C is that programs with ASan will run very slow and that you don't catch bugs that your actual program (optimized) will have. For that reason, I don't think C is a real viable option...

@ljmf00-wekaio
Copy link

Is shipping two libraries with and without sanitisers support a bit bloat? I mean, I think it would be easier to do so, and when the compiler has -fsanitize=<x,y,z> it would link against them, but I understand the drawbacks.
Another alternative to avoid bloat, maybe just have a diff of the symbols required for sanitizers support, strip the rest and patch the original druntime when -fsanitize= is passed. This alternative of diffing will only work for PIC code tho, right?


The option A has some performance problems that I think can be improved, the optional link lookup isn't memoized well, AFAIK, but I would still not prefer it. Release need to avoid having unnecessary stuff, in my view. For weka doesn't make a difference, so A is probably viable for Weka, because we build druntime ourselves anyway, but I'm thinking for the general case ofc.

I think to have a seamless experience, option C is not viable, so I would go for option B or a diff/improved version of option B.

@ljmf00-wekaio
Copy link

Regarding CheckFiberMigration, I would say druntime should document that it should be "undefined behaviour" if that happens. I assume this is for migrating fiber to run on another OS thread's stack, right? Perhaps druntime having runtime properties so user can query them to know if its allowed/safe or not.

@ljmf00-wekaio
Copy link

Another option: using ifunc support for dynamically providing two symbols in one build and select them at runtime. Does LLVM have similar support? See https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-ifunc-function-attribute .

@JohanEngelen
Copy link
Member Author

Another option: using ifunc support for dynamically providing two symbols in one build and select them at runtime. Does LLVM have similar support? See https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-ifunc-function-attribute .

This is effectively what druntime does now. ifunc is basically a function pointer that is initialized during startup, which is also how asan support is implemented (initialized upon first call).
So it is memoized, how much better do you think we can implemented that?

@ljmf00-wekaio
Copy link

Another option: using ifunc support for dynamically providing two symbols in one build and select them at runtime. Does LLVM have similar support? See gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-ifunc-function-attribute .

This is effectively what druntime does now. ifunc is basically a function pointer that is initialized during startup, which is also how asan support is implemented (initialized upon first call). So it is memoized, how much better do you think we can implemented that?

The memoizing I'm talking about is mainly on non-happy flow of getOptionalSanitizerFunc logic. If the linked symbol is null and the dlopen/dlsym fallback fails, that function will still always try the fallback the second/third/N times, instead of memoizing the failure and proceed without it by caching the last fallback result. On the most normal situations, where ASAN is not linked, dlopen/dsym will only need to run once the first call of that ASAN function, then, its branch predicted, should be pretty negligible performance impact (we need to measure it, to make sure).

@JohanEngelen
Copy link
Member Author

JohanEngelen commented Jan 19, 2025

This is effectively what druntime does now. ifunc is basically a function pointer that is initialized during startup, which is also how asan support is implemented (initialized upon first call). So it is memoized, how much better do you think we can implemented that?

The memoizing I'm talking about is mainly on non-happy flow of getOptionalSanitizerFunc logic. If the linked symbol is null and the dlopen/dlsym fallback fails, that function will still always try the fallback the second/third/N times, instead of memoizing the failure and proceed without it by caching the last fallback result. On the most normal situations, where ASAN is not linked, dlopen/dsym will only need to run once the first call of that ASAN function, then, its branch predicted, should be pretty negligible performance impact (we need to measure it, to make sure).

dlsym is only called once. If it is null, then the fptr is set to 1 (FUNC_NOT_FOUND), i.e. caching the null result as a non-null obviously bad pointer. That is then detected and converted back to null at the end of getOptionalSanitizerFunc

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sanitizers ASan, libFuzzer, ...
Projects
None yet
Development

No branches or pull requests

3 participants