Add interrupts for closeScope0
#13262
@tajila is this the idea? |
acde4b9
to
0a76487
Compare
Test is passing, previously e.printStackTrace() printed
|
closeScope0
closeScope0
When running TestHandshake I am getting |
I don't think it's that simple. We need to make sure the thread we are attempting to interrupt has landed at a safe point before setting the exception. By safe point I mean not holding on to vmaccess.
Take a look at how Specifically
|
0a76487
to
5e19150
Compare
I tried to ensure that the top frame of the scope accessor thread was interpreted but I am still getting Also, commented out is my other attempt at determining if the top frame is interpreted, but walkThread->literals was always null. |
Can you run the test with -Xint just to understand if this only occurs with jit frames. Perhaps run in a grinder. Assuming it fails with -Xint try setting Then repeat the test with -Xint again |
You can remove this code:
|
Also, set the haltFlags, etc. while the currentThread still has exclusive access |
17b8bf2
to
01dd3f7
Compare
@tajila This is consistently passing locally but not in ppc64le in Grinder (the exception thrown is still not caught). I think it may have to do with the exception being thrown in a native method? Is there a way to check this? |
The |
https://hyc-runtimes-jenkins.swg-devops.com/view/Test_grinder/job/Grinder/17787/console The exception is still not caught |
9a94802
to
ab95cf4
Compare
Problem
TestHandshake.java times out because scope accessor threads are not interrupted so they keep holding the scope, preventing close. This issue is intermittent.
How to investigate?
To diagnose the above problem, we need to know
Running the test with verbose output
The test is suspending, terminating and resuming threads. The test output may give a hint as to why the threads are not interrupted.
Java and system cores
Java and system cores can be used to investigate if the exception is correctly thrown and caught, and the state of the threads:
Trace points
|
caab3e5
to
b73e110
Compare
J9VMThread *scopeThread = current->thread;
LinkedThreads *prev = NULL;

if (NULL == scopeThread->currentException) {
`stopThrowable` will not be set if `currentException` is not null. On PPCLE, the JIT is capable of setting exceptions for hardware events. The same applies to S390 (zOS/zLinux). Refer to `jitPPCHandler`:
openj9/runtime/compiler/runtime/SignalHandler.c
Lines 572 to 574 in a2a4729
switch (trapType) {
case TRAP_TYPE_NULL_CHECK:
This will set `currentException` and prevent `stopThrowable` from being set in your code. Most probably, this is the cause of the intermittent `TestHandshake` timeout failures on PPCLE. To fix these failures, you will need to clear `currentException` and always set `stopThrowable`.
@gacholio Do we need to clear `currentException` while setting `stopThrowable`? If so, can we set `currentException` to `null` to clear it?
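To make the concern above concrete, here is a minimal toy model in C. It is not OpenJ9 code: `ToyExcThread`, `request_throw_guarded` and `request_throw_always` are hypothetical names, and the struct keeps only the two fields named in the comment. It only shows how a `NULL == currentException` guard can drop the asynchronous request when an exception is already pending (for example, one set by a trap handler). Whether `stopThrowable` is the right field to use at all is a separate question, raised in the reply that follows.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the VM thread: only the two fields under discussion. */
typedef struct ToyExcThread {
    void *currentException; /* pending exception, e.g. set by a trap handler */
    void *stopThrowable;    /* exception requested asynchronously by another thread */
} ToyExcThread;

/* Guarded variant, mirroring the NULL == currentException check under review.
 * The request is silently dropped when an exception is already pending. */
static bool request_throw_guarded(ToyExcThread *t, void *error)
{
    if (NULL == t->currentException) {
        t->stopThrowable = error;
        return true;
    }
    return false;
}

/* Unguarded variant: always record the request.
 * This sketch deliberately leaves currentException untouched. */
static bool request_throw_always(ToyExcThread *t, void *error)
{
    t->stopThrowable = error;
    return true;
}
```

With a pending exception in place, the guarded call returns `false` and leaves `stopThrowable` unset, which matches the described symptom of a scope accessor thread that is never interrupted.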
currentException is only set from the current thread, and should only be read from the current thread. Re-using stopThrowable is going to cause problems interacting with stop(), which is deprecated but still hasn't been removed.
801cdc5
to
9240a4e
Compare
We're thinking of decompiling all frames of threads that are accessing the scope during close (should be rare in practice). Is this possible in normal jit code? fyi @gacholio |
I haven't seen a detailed spec, but from what I understand, any thread in the closing scope needs to be made to throw a ScopedAccessError.
Before getting into the implementation problems, I see several issues with the spec itself, which does not say that it waits for the other threads to be kicked out of the scope before freeing the underlying memory. This implies that the throwing threads could never again access the scoped memory, which seems clearly untrue in these situations:
Native method on top of stack
The exception could be thrown upon return to the VM, so perhaps this isn't really a problem.
Though errors are not supposed to be caught, they regularly are, particularly in the try/finally case (which is implemented as try/catch). For example:
If the closing thread frees the memory, the above access would obviously be illegal. |
Part of the problem with implementing this is that the JIT does not allow exceptions to be thrown from async checks (other than the implicit one on method entry), so it's possible that there could be a loop in the compiled code which uses the scoped memory which will never be broken by the exception throw. I do not recall the reasoning behind this - perhaps @0xdaryl or @fjeremic might recall. Also note that the stop() exception is also not thrown on return from JIT helpers, again for no reason I can think of (since other exceptions can be thrown from them). |
A third suggestion is that the exception threads be decompiled before the throw. This is done for breakpoints in FSD mode, but is currently never done in non-FSD mode. This may work, assuming that every GC point has an OSR decompile block. If we can't force it from the closing thread, it might be possible to inject an async event and do the decompile on the local thread before throwing the exception. This would require some modifications to the async message mechanism. |
OK, assuming the JIT doesn't want to change that, decompilation seems the only viable solution for compiled frames. This would require that asynccheck be an OSR point in all cases (which I'm pretty sure is already true). Unless there's a specific reason why the exception needs to be allocated in the closing thread, I suggest this be implemented like frame pop, with an interrupt bit in the publicFlags that triggers the allocation and throwing of the exception, decompiling the top frame in the compiled case. |
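The shape of that suggestion can be sketched as a toy model in C11. Everything here is an assumption for illustration: `TOY_FLAG_CLOSE_SCOPE`, `ToyAsyncThread`, `toy_request_close_scope` and `toy_async_check` are hypothetical, and the real `publicFlags` layout and decompilation machinery are not shown in this thread. The closing thread only posts a bit; the target thread notices it at its own async check, decompiles a compiled top frame, and raises the error locally.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit; the real publicFlags bits are not shown in this thread. */
#define TOY_FLAG_CLOSE_SCOPE ((uint32_t)1u << 0)

typedef struct ToyAsyncThread {
    _Atomic uint32_t publicFlags; /* written by other threads, polled by the owner */
    bool topFrameCompiled;        /* true if the top frame is JIT-compiled */
    bool decompiled;              /* records that the top frame was decompiled */
    bool errorThrown;             /* records that the error was raised locally */
} ToyAsyncThread;

/* Closing thread: only posts the request, it does not allocate the exception. */
static void toy_request_close_scope(ToyAsyncThread *target)
{
    atomic_fetch_or(&target->publicFlags, TOY_FLAG_CLOSE_SCOPE);
}

/* Target thread: called at its own async check. Returns true if it threw. */
static bool toy_async_check(ToyAsyncThread *self)
{
    /* Atomically clear the bit and learn whether it was set. */
    uint32_t old = atomic_fetch_and(&self->publicFlags, ~TOY_FLAG_CLOSE_SCOPE);
    if (0 == (old & TOY_FLAG_CLOSE_SCOPE)) {
        return false; /* nothing requested */
    }
    if (self->topFrameCompiled) {
        /* Stand-in for decompiling the top frame before the throw. */
        self->decompiled = true;
        self->topFrameCompiled = false;
    }
    /* Stand-in for allocating and throwing the error on the local thread. */
    self->errorThrown = true;
    return true;
}
```

Keeping the allocation and throw on the target thread means the closing thread never has to touch another thread's exception state directly, which is the property the frame pop comparison is pointing at.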
@tajila Is this still under discussion? The async exception cannot be implemented reliably. |
Discussed with @tajila - the |
Much like stop and frame pop, we will need to interrupt the target thread to break it out of wait/sleep/park. |
We'll need to check for this upon return from native (currently it does not check for frame pop or stop). This does not cover JIT direct JNI (not currently sure what to do about that). |
To clarify, we use |
Not quite. I think we need similar behaviour to We need to check if the top frame is a JIT frame, if so trigger decompilation (ie. We also need to check if the top frame is wait/sleep/park, if so we need to interrupt the thread. |
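As a rough summary of the cases listed in this comment, the decision could be modelled as below. This is a sketch under assumptions, not the VM's logic: the enum names and `toy_choose_action` are hypothetical, and the fallback for an interpreted top frame (relying on the next async check) is taken from the surrounding discussion rather than from this comment.

```c
/* Hypothetical classification of the target thread's top frame. */
typedef enum ToyTopFrame {
    TOY_FRAME_INTERPRETED,
    TOY_FRAME_JIT,
    TOY_FRAME_BLOCKED /* wait, sleep, or park */
} ToyTopFrame;

/* Hypothetical action the closing thread arranges for the target thread. */
typedef enum ToyAction {
    TOY_ACTION_ASYNC_CHECK, /* assumed fallback: rely on the next async check */
    TOY_ACTION_DECOMPILE,   /* trigger decompilation of the JIT frame */
    TOY_ACTION_INTERRUPT    /* break the thread out of wait/sleep/park */
} ToyAction;

static ToyAction toy_choose_action(ToyTopFrame top)
{
    switch (top) {
    case TOY_FRAME_JIT:
        return TOY_ACTION_DECOMPILE;
    case TOY_FRAME_BLOCKED:
        return TOY_ACTION_INTERRUPT;
    case TOY_FRAME_INTERPRETED:
        return TOY_ACTION_ASYNC_CHECK;
    }
    /* Unreachable for valid enum values; keeps every path returning. */
    return TOY_ACTION_ASYNC_CHECK;
}
```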
As mentioned above, we will need to do an async check on return from JNI native calls. The VM already has these checks in the sleep and wait INLs. I believe park is a JNI native. The JIT will also need to do this, and the inserted asynccheck must not be moved or elided. Is that going to be a problem? @fjeremic |
9240a4e
to
70bb220
Compare
`closeScope0` should interrupt scope accessor threads with a `Scope.ScopedAccessError`. Closes: eclipse-openj9#13256 Signed-off-by: Eric Yang <[email protected]>
70bb220
to
dd56e25
Compare
Tagging @0xdaryl and @jdmpapin for this one. I'm not sure if we can guarantee the optimizer won't touch such asyncchecks if they are inserted, unless we do it at tree lowering time or something. Also do we need an asynccheck after every JNI call? I'm unsure about the performance implications. |
@gacholio I no longer think this is needed. Since we are restricting this to threads that have methods with the |
That would certainly simplify things. I suspect we will still need to add extra checks in various places in the interpreter, but for the first pass, we can just assume the async check will be hit and handle the throw. For the JIT case, it may be sufficient to detect the compiled frame on TOS, add a decompilation for it, then perform the throw (the exception throw code already handles the decompile cases). I will need to think about how exactly to do this, so perhaps it's best to proceed ignoring the JIT case for now. |
Our test currently has a wait to ensure that the scope is closed as a thread is in a |
Ya those tests will need to be modified |
218351a
to
e3a046d
Compare
With the
so maybe I didn't set the exception correctly? |
Some time (perhaps next week) we should have a call about this. I don't believe the current changes are getting us to where we need to be. |
@EricYangIBM This PR was closed due to the deletion of |
I accidentally deleted my branches when trying to push a renamed branch |
Closing as per #13256 (comment) |