Redispatch local tasks before thread parking in runBlocking #4312
The original issue is KTOR-7986 in Ktor. A coroutine deadlock occurs when `runBlocking` is used inside coroutines (unsurprisingly). While calling `runBlocking` from a suspend function is widely discouraged, the deadlock in this case is not a typical thread deadlock (see the technical details below).

Unlike most other deadlocks, this type of deadlock can potentially be prevented. I propose redispatching coroutine tasks from the local queue of the current worker thread before the thread is parked. I think an imperfect but functional solution (redispatching local tasks) is better than one that strictly adheres to best practices but fails in practice (do not use `runBlocking` inside coroutines).

As described in the original issue, the function that uses `runBlocking` in the Ktor project is not a suspend function. A coroutine context appears only in the test environment. While this is not good, as noted in the Ktor issue, such a scenario can arise in large projects with complex interactions, where a non-suspend function containing `runBlocking` is called inside a coroutine context in some non-standard case.

Technical details from the Ktor issue:
The following is a simplified version of the Ktor code (used in `ActorSelectorManager` and `SocketImpl`):

Since an unintercepted `Continuation` is used, the selector coroutine resumes synchronously in the same stack frame and hence on the same thread, behaving similarly to the `Unconfined` dispatcher. The `yield()` function dispatches a new coroutine task to the local queue of the current worker thread because that thread belongs to the same dispatcher that is specified for the selector coroutine (`Dispatchers.IO`). Next, the selector coroutine is suspended, and the thread is released. However, this thread is subsequently blocked by `runBlocking(CoroutineName("socket"))`. As a result, the previously created coroutine task gets stuck in the local queue of the blocked thread.
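
To illustrate only the first step of this explanation (resuming an unintercepted `Continuation` runs the coroutine on the resumer's thread rather than going through its dispatcher), here is a minimal sketch. It is not the Ktor code and does not reproduce the deadlock. The helper name `resumedThreadName` and the thread name `resumer` are hypothetical, and the sketch assumes `kotlinx-coroutines-core` is on the classpath.

```kotlin
import kotlin.concurrent.thread
import kotlin.coroutines.Continuation
import kotlin.coroutines.intrinsics.COROUTINE_SUSPENDED
import kotlin.coroutines.intrinsics.suspendCoroutineUninterceptedOrReturn
import kotlin.coroutines.resume
import kotlinx.coroutines.CompletableDeferred
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.delay
import kotlinx.coroutines.launch
import kotlinx.coroutines.runBlocking

// Hypothetical helper (not part of Ktor or kotlinx.coroutines): starts a coroutine
// on Dispatchers.IO, suspends it with an unintercepted continuation, resumes it
// from a plain thread named "resumer", and returns the name of the thread on
// which the coroutine body continued.
fun resumedThreadName(): String = runBlocking {
    val parked = CompletableDeferred<Continuation<Unit>>()
    val resumedOn = CompletableDeferred<String>()

    launch(Dispatchers.IO) {
        // Capture the raw continuation; no dispatcher interception is applied.
        suspendCoroutineUninterceptedOrReturn<Unit> { cont ->
            parked.complete(cont)
            COROUTINE_SUSPENDED
        }
        // Runs on whichever thread called resume(), not on an IO worker.
        resumedOn.complete(Thread.currentThread().name)
    }

    val cont = parked.await()
    // Crude safety margin for this sketch: make sure the coroutine has finished
    // suspending before it is resumed from another thread.
    delay(100)
    thread(name = "resumer") { cont.resume(Unit) }
    resumedOn.await()
}

fun main() {
    println(resumedThreadName())
}
```

In this sketch the coroutine is launched on `Dispatchers.IO` but should continue on the `resumer` thread. In the scenario described above, the resuming thread is itself a `Dispatchers.IO` worker, which is why the task dispatched by `yield()` lands in that worker's local queue.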