Towards soundness of PyByteArray::to_vec #4742
In free-threaded Python, to_vec needs to run inside a critical section so that no other Python thread can mutate the bytearray during the copy, which would be UB. See also PyO3#4736
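The idea of copying under a lock can be illustrated without any Python involved. The following is a minimal sketch using only the Rust standard library, not PyO3's implementation: `SharedBytes` is a hypothetical stand-in for a bytearray shared between threads, and the `Mutex` plays the role of the per-object critical section.

```rust
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a Python bytearray shared between threads.
/// The Mutex plays the role of the per-object critical section.
#[derive(Clone)]
pub struct SharedBytes {
    inner: Arc<Mutex<Vec<u8>>>,
}

impl SharedBytes {
    pub fn new(data: &[u8]) -> Self {
        Self {
            inner: Arc::new(Mutex::new(data.to_vec())),
        }
    }

    /// Copy the contents while holding the lock, so no other thread can
    /// resize or overwrite the buffer in the middle of the copy.
    pub fn to_vec(&self) -> Vec<u8> {
        let guard = self.inner.lock().unwrap();
        guard.clone()
    }

    /// Mutation takes the same lock. The copy in `to_vec` is only
    /// protected because every mutator goes through this lock as well.
    pub fn extend(&self, extra: &[u8]) {
        let mut guard = self.inner.lock().unwrap();
        guard.extend_from_slice(extra);
    }
}
```

In this sketch the protection depends on every mutator taking the same lock; a single locked reader does not help if writers bypass it.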
Thanks for the PR! We actually do have tests running for the free-threaded build; I would have been unhappy to declare support without them! Similarly, I have had virtualenv working just fine with 3.13t (haven't tried Windows, though).
I think we could write a test which spawns a thread which does something to attempt to invalidate the data (maybe write to it using py.run or PySequenceMethods::set_slice) and confirm that the data read is the original data inserted, not the conflicting data (which should hopefully now block on either the GIL or the critical section depending on the build).
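The shape of such a test can be sketched with the standard library alone. This is an analogy under stated assumptions rather than a PyO3 test: a `Mutex<Vec<u8>>` stands in for the bytearray and its critical section, and the function name `snapshots_are_consistent` is made up for illustration. A writer thread keeps overwriting the whole buffer with a single byte value while the reader takes copies and checks that no copy mixes two values.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Returns true if every snapshot taken while a writer thread is active
/// contains one repeated byte value, i.e. no torn read was observed.
pub fn snapshots_are_consistent(rounds: u32) -> bool {
    // Hypothetical stand-in for the bytearray; the Mutex models the
    // critical section (or the GIL on the default build).
    let buffer = Arc::new(Mutex::new(vec![0u8; 1024]));

    let writer_buffer = Arc::clone(&buffer);
    let writer = thread::spawn(move || {
        for round in 0..rounds {
            let value = (round % 251) as u8;
            // The writer holds the lock for the whole overwrite.
            let mut guard = writer_buffer.lock().unwrap();
            for byte in guard.iter_mut() {
                *byte = value;
            }
        }
    });

    let mut consistent = true;
    for _ in 0..rounds {
        // The copy happens entirely under the lock; the guard is
        // released at the end of this statement.
        let snapshot: Vec<u8> = buffer.lock().unwrap().clone();
        let first = snapshot[0];
        if snapshot.iter().any(|&byte| byte != first) {
            consistent = false;
        }
    }

    writer.join().unwrap();
    consistent
}
```

A real test against PyO3 would replace the writer with Python code mutating the bytearray, as suggested above; the consistency check on the copied data would stay the same.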
Co-authored-by: David Hewitt <[email protected]>
@davidhewitt I tried to write a test running Not sure where to go from here. However, no matter how hard I tried, I couldn't get it to segfault. So maybe there's something more to it that I'm not aware of.
I think it's no surprise that it's hard to segfault; you'd have to do something like turn the uninitialized read into a cast on the bytes to create a structure in an invalid state. Nevertheless, invalid reads alone are a clear security issue. This problem clearly gets a lot worse in free-threaded Python. My knee-jerk reaction is to make all bytearray methods in PyO3 unsafe. cc @ngoldbaum @colesbury: is there any upstream opinion on how to handle bytearray objects on the free-threaded build?
I can't find any discussion about bytearray and free-threading in the CPython issue tracker. You may want to file an issue, especially if you can make a pure-Python reproducer using the
Thanks for that. I'm a bit unsure what the way forward here is. Without upstream also using critical sections, as you observe, adding the single section here seems a bit moot. I think we cannot change our API in a patch release, so I think the likely path at the moment is that we make all the methods unsafe in PyO3 0.24?
Seems right
Hopefully a future Python release will fix the thread safety issues you identified and we can at least make the free-threaded build have similar guarantees compared with the GIL-enabled build.
I can see arguments for and against that. I'm slightly gravitating towards not doing it though. The way I see it is that PyO3's memory safety stands and falls with the soundness of the linked Python implementation. You have to assume that it's sound. If you don't, every PyO3 API would be unsafe (which I guess is Rust's standpoint, as every FFI call is unsafe). So in this particular case
Just my 2 cents and you're much deeper into the world of this wonderful crate so ofc. it's up to you to decide 😇
I guess it is 🫤 Feel free to close the PR ⚰️
Unfortunately, it seems I can't write proper tests for this as Python 3.13t is not yet part of the test matrix. I'm aware that support for testing with 3.13 and 3.13t is still in its early stages and that, for instance, virtualenv does not yet support it.