-
Notifications
You must be signed in to change notification settings - Fork 790
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add #[pyo3(allow_threads)] to release the GIL in (async) functions #3610
Conversation
32c2050
to
77b24fa
Compare
CodSpeed Performance ReportMerging #3610 will not alter performanceComparing Summary
|
905f1eb
to
3bc7d67
Compare
@davidhewitt Here is the next one! I'm more explicit now 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent!
@davidhewitt I have a few minutes now and the merge queue failed. If we could hold off on merging I could have a look as well? |
@@ -16,7 +16,7 @@ error: expected argument from function definition `y` but got argument `x` | |||
13 | #[pyo3(signature = (x))] | |||
| ^ | |||
|
|||
error: expected one of: `name`, `pass_module`, `signature`, `text_signature`, `crate` | |||
error: expected one of: `allow_threads`, `name`, `pass_module`, `signature`, `text_signature`, `crate` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I would really want to have here is a test that checks that I cannot get py: Python<'_>
as my argument when using allow_threads
. This should not be possible to do as the GIL is released, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Ideally, it would also have a nice error message explaining why these two contradict each other.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You would have a compilation error, because Python
is not Ungil
, but you're right, I will add that case in the UI tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As expected, there is a compilation error, but the output will not make @davidhewitt happy 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add an error message for Python
, but not really for things like &PyAny
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uff. Could you split the test into two, one for the GIL token and for the GIL bound reference?
For the GIL token: Since we are injecting this with our proc macro machinery, we should be able to do something about the error message for this one, shouldn't we?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add an error message for Python, but not really for things like &PyAny
Indeed, but I think we should do what we can. Sorry for putting more work on your table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not an issue, it will just wait a few hours but I will do the modification 😃
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Just for coverage, we should also duplicate the compiler error test for plain and async functions.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a check for Python<'_>
field for both allow_threads
and async functions. The error raised by Bound<'_, PyAny>
for these cases is still ugly, but as said earlier, I don't think it's possible to add a pretty error message, and this is already the case for Python::allow_threads
tests in ui/not_send.rs and ui/not_send2.rs.
Sure thing, I'll leave this for you to review and merge 👍 |
src/coroutine.rs
Outdated
self: Pin<&mut Self>, | ||
py: Python<'_>, | ||
waker: &Waker, | ||
allow_threads: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might be me going off into the weeds, but we turning a compile time information into a runtime flag here and generating code to handle both cases even only one will ever be executed.
Hence, I wonder if we should use a const generic const ALLOW_THREADS: bool
on Coroutine
to ensure that only the required code is generated?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, that's what I have in my POC. I'd made the whole Coroutine
generic and boxed it in the #[pyclass]
.
But I decide to go for simplicity here. I can change that if you think the performance impact will be significant.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think simplicity is certainly good for the first iteration and this could definitely be a follow-up PR.
That said, I was wondering if we could hide this behind the boxed CoroutineFuture
trait to avoid generics all the way up, e.g. a marker wrapper like
struct AllowThreads<F>(F);
impl<F, T, E> CoroutineFuture for AllowThreads<F>
where
F: Future<Output = Result<T, E>> + Send,
T: IntoPy<PyObject> + Send,
E: Into<PyErr> + Send;
but I am not sure how well this can be tucked away to avoid complicating the instantiation too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
EDIT: I've kept allow_thread
as a function parameter instead of generic parameter in the constructor, because it will be more user-friendly when the constructor will be made public.
#4033 broke some code here, I will update the PR. |
40c0b9f
to
08948df
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM, but this is definitely a new area for me, so someone with more experience here should have a look as well.
We might additionally want a test to verify that for (async
) allow_threads
methods only &self
/&mut self
receivers are allowed and Bound
/PyRef
/PyRefMut
are rejected.
fn without_gil(_arg1: PyObject, _arg2: PyObject) { | ||
GIL_COUNT.with(|c| assert_eq!(c.get(), 0)); | ||
} | ||
Python::with_gil(|gil| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally call the Python token py
. I think we should stick with that for consistency here as well.
@@ -88,10 +125,10 @@ impl Coroutine { | |||
} else { | |||
self.waker = Some(Arc::new(AsyncioWaker::new())); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can split this here and do something like
let waker = self
.waker
.get_or_insert_with(|| Arc::new(AsyncioWaker::new()));
after the (possible) reset, to avoid the unwrap()
s below?
@@ -54,6 +54,6 @@ fn test_compile_errors() { | |||
t.compile_fail("tests/ui/invalid_pymodule_two_pymodule_init.rs"); | |||
#[cfg(feature = "experimental-async")] | |||
#[cfg(any(not(Py_LIMITED_API), Py_3_10))] // to avoid PyFunctionArgument for &str |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given you've changed the argument types to i32
, I think this comment/feature gate is not accurate anymore.
{ | ||
ensure_spanned!( | ||
self.asyncness.is_none(), | ||
py_arg.ty.span() => "GIL token cannot be passed to async function" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 it looks like this already doesn't work so this makes the error message nicer. One suggestion given the GIL is potentially an outdated idea:
py_arg.ty.span() => "GIL token cannot be passed to async function" | |
py_arg.ty.span() => "`Python<'_>` token cannot be passed to async functions" |
); | ||
ensure_spanned!( | ||
self.allow_threads.is_none(), | ||
py_arg.ty.span() => "GIL cannot be held in function annotated with `allow_threads`" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here I think we want to just name the type:
py_arg.ty.span() => "GIL cannot be held in function annotated with `allow_threads`" | |
py_arg.ty.span() => "`Python<'_>` cannot be passed to a function annotated with `allow_threads`" |
quote! {{ | ||
#self_arg_decl | ||
#(let #arg_names = #args;)* | ||
py.allow_threads(|| function(#self_arg_name #(#arg_names),*)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a question in my head here about after #3646 which of the allow_threads
variants we will use here (cc @adamreichold). I think it would be possible to create a scoped TLS issue by setting up a TLS scope and then calling into Python. Similar thinking applies to the async case below. I guess we have to stick to the principle that the soundness is important and use the "safe but slow" form for this attribute. If users want to go unsafe
they can release the GIL themselves. This feels the right way to do it, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels the right way to do it, I think.
Yes, I think so too. Especially since we have already accumulated a significant amount of high risk changes for 0.22 and we can always optimize things in following releases.
@wyfo I may try to rebase and merge this tomorrow at the PyCon sprints. Any objections? |
If anyone has time to work on this, it could make a PR I need to write have rather smaller diffs 😁 |
@davidhewitt If we're good with including this in 0.23 I could go rebase this and fixup the last final remarks from above. |
I would love to have this, but I am worried about interactions with #3646. I think merging this before that PR is potentially a problem (might have breakage), and I am worried that PR is necessary but also makes async extremely complicated (because that PR basically disables all TLS with |
(My personal use case is non-async functions, so I'd be happy with this feature to land with initially just supporting normal functions. But I do see why this would be much more of a useful thing for async functions.) |
The interaction between async runtimes, threadlocals and #3646 remains whether or not this PR is merged. Is your concern that we'll have more fallout if we merge this and then that? |
@davidhewitt Why does the PR disable all TLS? |
Merging that PR means that allow_threads runs its closures on a threadpool, so it's executed on a different thread. It'd be surprising to people who are accessing threadlocals from it. |
Yes, I still feel like we should aim to do something like #3646 but the more I look at it the more extreme the fallout seems to be. Merging this seems to encourage use of |
As a user I would find the use of a thread pool very surprising and it would make (From my personal perspective the thread pool option seems very intrusive and in my own projects I would likely default to the unsound-if-you-put-Python-objects-into-TLS version that runs in current thread.) More practically for this PR: there could be two different attributes, one for each proposed allow_threads strategy? |
I don't think we can add an attribute for any |
Yes, though it's still in RFC and we might want to see how that works out: rust-lang/rfcs#3715 |
@davidhewitt I've just realized that this feature shouldn't be an issue regarding TLS smuggling. Indeed, functions annotated with |
You can have a stack that goes back and forth between Rust and Python, it isn't necessarily just Python → Rust: Python function → Rust function that stores something in TLS → Python function → Rust function annotated with |
Yes, it is exactly that kind of mixed language call stack which is my worry. I really want this feature, and I think the practical way forward is probably to use the I will be speaking at RustNation in a few weeks and will be bringing up this TLS pain as part of my talk, so maybe it will spur some bright minds in the Rust ecosystem to help us figure out new ways forward we haven't yet seen. |
Indeed, sorry about the noise. Anyway, let's recap for once why this PR exists. We don't care about synchronous function here, as I supported them only for the symmetry with asynchronous functions. The last one needed a special support, because to release the GIL at each poll, you need to use the unsafe but guaranteed assumption that the GIL is acquired when Therefore, I will simply close this PR. |
Ok, makes sense and awesome to hear you have some capacity! I must apologise in advance as I am going extremely slowly recently and unlikely to be back to full capacity until March. But I will do my best to review anything you push, async support got stuck for too long. Since we last collaborated on this I think I saw new |
It has indeed, so I could use that to save me my dirty TLS trick in the next PR. However, it would imply to radically bump the MSRV to 1.83, so I don't think we want to do that for now. However, I will let a TODO as a reminder for the next bump. |
Relates to #1632
Draft based on #3609