GIL functions for genuine multi-threading (#535)
* slightly more thread safe gc

* use Channel not Vector and make disable/enable a no-op

* document GCHook

* cannot lock channels on julia 1.6

* revert to using a vector for the queue

* restore test script

* combine queue into a single item

* prefer Fix2 over anonymous function

* update docs

* test multithreaded

* test gc from python

* add gc tests

* fix test

* add deprecation warnings

* safer locking (plus explanatory comments)

* ref of weakref

* SpinLock -> ReentrantLock

* SpinLock -> ReentrantLock

* add PythonCall.GIL

* add tests for PythonCall.GIL

* add GIL to release notes

* add GIL release tests from Python

* typo: testset -> testitem

* delete redundant test

* remove out of date comment

* comment erroneous test

* re-enable commented test

* adds AnyValue._jl_call_nogil

* add RawValue._jl_call_nogil

* add docstrings

* add warnings about the GIL to docstrings

* add reference docstrings

* remove big pycall comparison and move pycall help to faq

* document new threading features

* update release notes

* clarification

* rename GIL.release to GIL.unlock and use lock/unlock terminology consistently

---------

Co-authored-by: Christopher Doris <github.com/cjdoris>
cjdoris authored Aug 7, 2024
1 parent bcd2bbb commit 4a1ee78
Showing 17 changed files with 416 additions and 105 deletions.
5 changes: 2 additions & 3 deletions README.md
@@ -40,9 +40,8 @@ In this example we use the Python module JuliaCall from an IPython notebook to t

## What about PyCall?

The existing package [PyCall](https://github.com/JuliaPy/PyCall.jl) is another similar interface to Python. Here we note some key differences, but a more detailed comparison is in the documentation.
The existing package [PyCall](https://github.com/JuliaPy/PyCall.jl) is another similar interface to Python. Here we note some key differences:
- PythonCall supports a wider range of conversions between Julia and Python, and the conversion mechanism is extensible.
- PythonCall by default never copies mutable objects when converting, but instead directly wraps the mutable object. This means that modifying the converted object modifies the original, and conversion is faster.
- PythonCall does not usually automatically convert results to Julia values, but leaves them as Python objects. This makes it easier to do Pythonic things with these objects (e.g. accessing methods) and is type-stable.
- PythonCall installs dependencies into a separate Conda environment for each Julia project. This means each Julia project can have an isolated set of Python dependencies.
- PythonCall supports Julia 1.6.1+ and Python 3.8+ whereas PyCall supports Julia 0.7+ and Python 2.7+.
- PythonCall installs dependencies into a separate Conda environment for each Julia project using [CondaPkg](https://github.com/JuliaPy/CondaPkg.jl). This means each Julia project can have an isolated set of Python dependencies.
1 change: 0 additions & 1 deletion docs/make.jl
@@ -19,7 +19,6 @@ makedocs(
],
"compat.md",
"faq.md",
"pycall.md",
"releasenotes.md",
],
)
27 changes: 14 additions & 13 deletions docs/src/faq.md
@@ -1,21 +1,22 @@
# FAQ & Troubleshooting

## Is PythonCall/JuliaCall thread safe?
## Can I use PythonCall and PyCall together?

Yes, you can use both PyCall and PythonCall in the same Julia session. This is platform-dependent:
- On most systems the Python interpreter used by PythonCall and PyCall must be the same (see below).
- On Windows it appears to be possible for PythonCall and PyCall to use different interpreters.

To force PythonCall to use the same Python interpreter as PyCall, set the environment variable [`JULIA_PYTHONCALL_EXE`](@ref pythoncall-config) to `"@PyCall"`. Note that this will opt out of automatic dependency management using CondaPkg.

No.
Alternatively, to force PyCall to use the same interpreter as PythonCall, set the environment variable `PYTHON` to [`PythonCall.python_executable_path()`](@ref) and then run `Pkg.build("PyCall")`. You will need to do this each time you change project, because PythonCall by default uses a different Python for each project.

## Is PythonCall/JuliaCall thread safe?

However it is safe to use PythonCall with Julia with multiple threads, provided you only
call Python code from the first thread. (Before v0.9.22, tricks such as disabling the
garbage collector were required.)
Yes, as of v0.9.22, provided you handle the GIL correctly. See the guides for
[PythonCall](@ref jl-multi-threading) and [JuliaCall](@ref py-multi-threading).

From Python, to use JuliaCall with multiple threads you probably need to set
[`PYTHON_JULIACALL_HANDLE_SIGNALS=yes`](@ref julia-config) before importing JuliaCall.
This is because Julia intentionally causes segmentation faults as part of the GC
safepoint mechanism. If unhandled, these segfaults will result in termination of the
process. This is equivalent to starting julia with `julia --handle-signals=yes`, the
default behavior in Julia. See discussion
[here](https://github.com/JuliaPy/PythonCall.jl/issues/219#issuecomment-1605087024)
for more information.
Before v0.9.22, tricks such as disabling the garbage collector were required. See the
[old docs](https://juliapy.github.io/PythonCall.jl/v0.9.21/faq/#Is-PythonCall/JuliaCall-thread-safe?).

Related issues:
[#201](https://github.com/JuliaPy/PythonCall.jl/issues/201),
8 changes: 5 additions & 3 deletions docs/src/juliacall-reference.md
@@ -1,4 +1,4 @@
# JuliaCall API Reference
# [JuliaCall API Reference](@id jl-reference)

## Constants

@@ -93,8 +93,9 @@ replaced with `!!`.
###### Members
- `_jl_raw()`: Convert to a [`RawValue`](#juliacall.RawValue). (See also [`pyjlraw`](@ref).)
- `_jl_display()`: Display the object using Julia's display mechanism.
- `_jl_help()`: Display help for the object.
- `_jl_display(mime=None)`: Display the object using Julia's display mechanism.
- `_jl_help(mime=None)`: Display help for the object.
- `_jl_call_nogil(*args, **kwargs)`: Call this Julia function with the GIL unlocked.
`````

`````@customdoc
@@ -217,4 +218,5 @@ single tuple, it will need to be wrapped in another tuple.
###### Members
- `_jl_any()`: Convert to an [`AnyValue`](#juliacall.AnyValue) (or subclass). (See also
[`pyjl`](@ref).)
- `_jl_call_nogil(*args, **kwargs)`: Call this Julia function with the GIL unlocked.
`````
76 changes: 76 additions & 0 deletions docs/src/juliacall.md
@@ -124,3 +124,79 @@ be configured in two ways:
| `-X juliacall-threads=<N\|auto>` | `PYTHON_JULIACALL_THREADS=<N\|auto>` | Launch N threads. |
| `-X juliacall-warn-overwrite=<yes\|no>` | `PYTHON_JULIACALL_WARN_OVERWRITE=<yes\|no>` | Enable or disable method overwrite warnings. |
| `-X juliacall-autoload-ipython-extension=<yes\|no>` | `PYTHON_JULIACALL_AUTOLOAD_IPYTHON_EXTENSION=<yes\|no>` | Enable or disable IPython extension autoloading. |

## [Multi-threading](@id py-multi-threading)

From v0.9.22, JuliaCall supports multi-threading in Julia and/or Python, with some
caveats.

Most importantly, you can only call Python code while Python's
[Global Interpreter Lock (GIL)](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)
is locked by the current thread. You can use JuliaCall from any Python thread, and the GIL
will be locked whenever any JuliaCall function is used. However, to leverage the benefits
of multi-threading, you can unlock the GIL while executing any Julia code that does not
interact with Python.

The simplest way to do this is using the `_jl_call_nogil` method on Julia functions to
call the function with the GIL unlocked.

```python
from concurrent.futures import ThreadPoolExecutor, wait
from juliacall import Main as jl
pool = ThreadPoolExecutor(4)
fs = [pool.submit(jl.Libc.systemsleep._jl_call_nogil, 5) for _ in range(4)]
wait(fs)
```

In the above example, we call `Libc.systemsleep(5)` on four threads. Because we
called it with `_jl_call_nogil`, the GIL was unlocked, allowing the threads to run in
parallel, taking about 5 seconds in total.

If we did not use `_jl_call_nogil` (i.e. if we did `pool.submit(jl.Libc.systemsleep, 5)`)
then the above code would take about 20 seconds, because each call holds the GIL and the
sleeps run one after another.
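
The timing difference can be reproduced without Julia. The sketch below is only an
analogy, not JuliaCall code: a plain `threading.Lock` (the hypothetical name `fake_gil`)
stands in for the GIL, and `time.sleep` stands in for the Julia call. Holding the lock
for the whole call serialises the workers, while leaving it unlocked lets them overlap.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical stand-in for the GIL; this is not part of JuliaCall.
fake_gil = threading.Lock()

def call_holding_lock(seconds):
    # Analogous to a normal JuliaCall call: the lock is held throughout.
    with fake_gil:
        time.sleep(seconds)

def call_without_lock(seconds):
    # Analogous to _jl_call_nogil: the lock is not held while the work runs.
    time.sleep(seconds)

def timed(fn, seconds=0.2, workers=4):
    # Run fn on `workers` threads and return the elapsed wall-clock time.
    start = time.perf_counter()
    with ThreadPoolExecutor(workers) as pool:
        wait([pool.submit(fn, seconds) for _ in range(workers)])
    return time.perf_counter() - start

serial = timed(call_holding_lock)    # roughly workers * seconds
parallel = timed(call_without_lock)  # roughly seconds
print(f"lock held: {serial:.2f}s, lock unlocked: {parallel:.2f}s")
```

The short sleeps keep the demonstration quick. The ratio between the two timings, not
the absolute numbers, is the point.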

It is very important that any function called with `_jl_call_nogil` does not interact
with Python at all unless it re-locks the GIL first, such as by using
[`PythonCall.GIL.@lock`](@ref).

You can also use [multi-threading from Julia](@ref jl-multi-threading).

### Caveat: Julia's task scheduler

If you try the above example with a Julia function that yields to the task scheduler,
such as `sleep` instead of `Libc.systemsleep`, then you will likely experience a hang.

In this case, you need to yield back to Julia's scheduler periodically to allow the task
to continue. You can use the following pattern instead of `wait(fs)`:
```python
jl_yield = getattr(jl, "yield")
while True:
    # yield to Julia's task scheduler
    jl_yield()
    # wait for up to 0.1 seconds for the threads to finish
    state = wait(fs, timeout=0.1)
    # if they finished then stop otherwise try again
    if not state.not_done:
        break
```

Set the `timeout` parameter smaller to let Julia's scheduler cycle more frequently.

Future versions of JuliaCall may provide tooling to make this simpler.
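
Until then, the polling pattern above can be wrapped in a small helper. This is a
minimal sketch, not a JuliaCall API: `wait_yielding` and its `yield_fn` parameter are
hypothetical names. The demo passes a stand-in callable so that it runs without Julia.
With JuliaCall you would pass `getattr(jl, "yield")` instead.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait

def wait_yielding(futures, yield_fn, timeout=0.1):
    """Wait for all futures, calling yield_fn() before each poll."""
    pending = set(futures)
    while pending:
        # Give the other scheduler a chance to run its tasks.
        yield_fn()
        # Block for at most `timeout` seconds, then keep only unfinished futures.
        pending = wait(pending, timeout=timeout).not_done

# Demo with a stand-in for Julia's yield: it just records each call.
calls = []
with ThreadPoolExecutor(2) as pool:
    fs = [pool.submit(time.sleep, 0.3) for _ in range(2)]
    wait_yielding(fs, lambda: calls.append(1), timeout=0.05)

all_done = all(f.done() for f in fs)
print(f"finished: {all_done}, yield calls: {len(calls)}")
```

Taking the yield function as a parameter keeps the helper independent of JuliaCall and
easy to test.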

### [Caveat: Signal handling](@id py-multi-threading-signal-handling)

We recommend setting [`PYTHON_JULIACALL_HANDLE_SIGNALS=yes`](@ref julia-config)
before importing JuliaCall with multiple threads.

This is because Julia intentionally causes segmentation faults as part of the GC
safepoint mechanism. If unhandled, these segfaults will result in termination of the
process. See discussion
[here](https://github.com/JuliaPy/PythonCall.jl/issues/219#issuecomment-1605087024)
for more information.

Note however that this interferes with Python's own signal handling, so for example
Ctrl-C will not raise `KeyboardInterrupt`.

Future versions of JuliaCall may make this the default behaviour when using multiple
threads.
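
As a minimal sketch, the variable can be set from Python itself, provided this happens
before JuliaCall is first imported. The import is commented out here so that the snippet
runs without JuliaCall installed.

```python
import os

# Set this before the first `import juliacall` in the process.
os.environ["PYTHON_JULIACALL_HANDLE_SIGNALS"] = "yes"
handle_signals = os.environ["PYTHON_JULIACALL_HANDLE_SIGNALS"]

# Only now import JuliaCall:
# from juliacall import Main as jl
```

Setting the variable in the shell before starting Python works as well.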
75 changes: 0 additions & 75 deletions docs/src/pycall.md

This file was deleted.

13 changes: 13 additions & 0 deletions docs/src/pythoncall-reference.md
@@ -218,6 +218,19 @@ Py(x::MyType) = x.py
@pyconst
```

## Multi-threading

These functions are not exported. They support multi-threading of Python and/or Julia.
See also [`juliacall.AnyValue._jl_call_nogil`](@ref julia-wrappers).

```@docs
PythonCall.GIL.lock
PythonCall.GIL.@lock
PythonCall.GIL.unlock
PythonCall.GIL.@unlock
PythonCall.GC.gc
```

## The Python interpreter

These functions are not exported. They give information about which Python interpreter is
40 changes: 40 additions & 0 deletions docs/src/pythoncall.md
@@ -362,3 +362,43 @@ end

If your package depends on some Python packages, you must generate a `CondaPkg.toml` file.
See [Installing Python packages](@ref python-deps).

## [Multi-threading](@id jl-multi-threading)

From v0.9.22, PythonCall supports multi-threading in Julia and/or Python, with some
caveats.

Most importantly, you can only call Python code while Python's
[Global Interpreter Lock (GIL)](https://docs.python.org/3/glossary.html#term-global-interpreter-lock)
is locked by the current thread. Ordinarily, the GIL is locked by the main thread in Julia,
so if you want to run Python code on any other thread, you must unlock the GIL from the
main thread and then re-lock it while running any Python code on other threads.

This is made possible by the macros [`PythonCall.GIL.@unlock`](@ref) and
[`PythonCall.GIL.@lock`](@ref) or the functions [`PythonCall.GIL.unlock`](@ref) and
[`PythonCall.GIL.lock`](@ref) with this pattern:

```julia
PythonCall.GIL.@unlock Threads.@threads for i in 1:4
    PythonCall.GIL.@lock pyimport("time").sleep(5)
end
```

In the above example, we call `time.sleep(5)` four times in parallel. If Julia was
started with at least four threads (`julia -t4`) then the above code will take about
5 seconds.

Both `@unlock` and `@lock` are important. If the GIL were not unlocked, then a deadlock
would occur when attempting to lock the already-locked GIL from the threads. If the GIL
were not re-locked, then Python would crash when interacting with it.

You can also use [multi-threading from Python](@ref py-multi-threading).

### Caveat: Garbage collection

If Julia's GC collects any Python objects from a thread where the GIL is not currently
locked, then those Python objects will not immediately be deleted. Instead they will be
queued to be deleted in a later GC pass.

If you find you have many Python objects not being deleted, you can call
[`PythonCall.GC.gc()`](@ref) or `GC.gc()` while the GIL is locked to clear the queue.
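
The deferral described above can be pictured with a toy model. This is a sketch in
Python for illustration only and is not PythonCall's implementation: the class
`DeferredDeleter`, its methods and the `gil_locked` flag are all hypothetical.

```python
class DeferredDeleter:
    """Toy model: delete now if the GIL is locked, otherwise queue for later."""

    def __init__(self):
        self.pending = []  # objects waiting to be deleted
        self.deleted = []  # objects already deleted

    def finalize(self, obj, gil_locked):
        # Called when the garbage collector drops a wrapped Python object.
        if gil_locked:
            self.deleted.append(obj)
        else:
            self.pending.append(obj)

    def gc(self):
        # Intended to be called while the GIL is locked: clears the queue.
        self.deleted.extend(self.pending)
        self.pending.clear()

d = DeferredDeleter()
d.finalize("a", gil_locked=True)   # deleted immediately
d.finalize("b", gil_locked=False)  # queued
queued_before = list(d.pending)
d.gc()                             # queue cleared
```

In this model, `queued_before` holds `"b"` until `gc()` runs, which mirrors why an
explicit GC call with the GIL locked clears a backlog.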
5 changes: 5 additions & 0 deletions docs/src/releasenotes.md
@@ -7,6 +7,11 @@
* `GC.disable()` and `GC.enable()` are now a no-op and deprecated since they are no
longer required for thread-safety. These will be removed in v1.
* Adds `GC.gc()`.
* Adds module `GIL` with `lock()`, `unlock()`, `@lock` and `@unlock` for handling the
Python Global Interpreter Lock. In combination with the above improvements, these
allow Julia and Python to co-operate on multiple threads.
* Adds method `_jl_call_nogil` to `juliacall.AnyValue` and `juliacall.RawValue` to call
Julia functions with the GIL unlocked.

## 0.9.21 (2024-07-20)
* `Serialization.serialize` can use `dill` instead of `pickle` by setting the env var `JULIA_PYTHONCALL_PICKLE=dill`.