Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for async generator finalization #1564

Merged
merged 23 commits into from
Aug 5, 2020
Merged
Show file tree
Hide file tree
Changes from 18 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
26f1aea
Async generator hooks, simpler approach
oremanj Jun 10, 2020
ac3e46d
blacken
oremanj Jun 10, 2020
2947de7
Respond to review comments + add more tests
oremanj Jun 25, 2020
d198ed2
blacken
oremanj Jun 25, 2020
b720303
Merge remote-tracking branch 'origin/master' into asyncgenhooks-basic
oremanj Jun 25, 2020
45c2eee
flake8
oremanj Jun 25, 2020
2d81bb0
Fix mismerge with master
oremanj Jun 25, 2020
5a37f78
Work correctly in -Werror mode too
oremanj Jun 25, 2020
3acf34b
Handle asyncgens correctly when Trio is the guest of an asyncio host
oremanj Jun 25, 2020
80e8ab3
Fix 3.6
oremanj Jun 25, 2020
cfdb850
Make tests pass on pypy 7.2 which doesn't run firstiter hooks
oremanj Jun 25, 2020
153ce13
Hopefully resolve coverage issues
oremanj Jun 25, 2020
ebaf69c
blacken
oremanj Jun 25, 2020
5b2c544
Merge remote-tracking branch 'origin/master' into asyncgenhooks-basic
oremanj Jun 25, 2020
936ccdb
Add docs and newsfragment
oremanj Jun 25, 2020
7d0fcd9
Fix formatting
oremanj Jun 25, 2020
86a8b7d
Merge remote-tracking branch 'origin/master' into asyncgenhooks-basic
oremanj Jul 7, 2020
4776728
Fix flake8
oremanj Jul 7, 2020
5574b9e
Merge remote-tracking branch 'origin/master' into asyncgenhooks-basic
oremanj Jul 24, 2020
9868f25
Respond to review comments; split asyncgens logic into a separate file
oremanj Jul 24, 2020
2af6974
Fix mypy
oremanj Jul 24, 2020
33d168e
Merge remote-tracking branch 'origin/master' into asyncgenhooks-basic
oremanj Aug 5, 2020
0441952
Review responses
oremanj Aug 5, 2020
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Empty file modified docs/source/conf.py
100644 → 100755
Empty file.
174 changes: 174 additions & 0 deletions docs/source/reference-core.rst
Original file line number Diff line number Diff line change
Expand Up @@ -1465,6 +1465,180 @@ don't have any special access to Trio's internals.)
:members:


.. _async-generators:

Notes on async generators
-------------------------

Python 3.6 added support for *async generators*, which can use
``await``, ``async for``, and ``async with`` in between their ``yield``
statements. As you might expect, you use ``async for`` to iterate
over them. :pep:`525` has many more details if you want them.

For example, the following is a roundabout way to print
the numbers 0 through 9 with a 1-second delay before each one::

async def range_slowly(*args):
"""Like range(), but adds a 1-second sleep before each value."""
for value in range(*args):
await trio.sleep(1)
yield value

async def use_it():
async for value in range_slowly(10):
print(value)

trio.run(use_it)

Trio supports async generators, with some caveats described in this section.

Finalization
~~~~~~~~~~~~

If you iterate over an async generator in its entirety, like the
example above does, then the execution of the async generator will
occur completely in the context of the code that's iterating over it,
and there aren't too many surprises.

If you abandon a partially-completed async generator, though, such as
by ``break``\ing out of the iteration, things aren't so simple. The
async generator iterator object is still alive, waiting for you to
resume iterating it so it can produce more values. At some point,
Python will realize that you've dropped all references to the
iterator, and will call on Trio to help with executing any remaining
cleanup code inside the generator: ``finally`` blocks, ``__aexit__``
oremanj marked this conversation as resolved.
Show resolved Hide resolved
handlers, and so on.

So far, so good. Unfortunately, Python provides no guarantees about
*when* this happens. It could be as soon as you break out of the
``async for`` loop, or an arbitrary amount of time later. It could
even be after the entire Trio run has finished! Just about the only
guarantee is that it *won't* happen in the task that was using the
generator. That task will continue on with whatever else it's doing,
and the async generator cleanup will happen "sometime later,
somewhere else".
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add something here about why this matters: that it means timeouts are gone, contextvars are gone, nurseries might be closed, etc., and your cleanup code may not be expecting that.

Maybe we should also mention somewhere that most of this applies to regular generators too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a sentence mentioning possible problems. I couldn't figure out how to also mention synchronous generators without getting too far afield, but feel free to take a shot yourself!


If you don't like that ambiguity, and you want to ensure that a
generator's ``finally`` blocks and ``__aexit__`` handlers execute as
soon as you're done using it, then you'll need to wrap your use of the
generator in something like `async_generator.aclosing()
<https://async-generator.readthedocs.io/en/latest/reference.html#context-managers>`__::

# Instead of this:
async for value in my_generator():
if value == 42:
break

# Do this:
async with aclosing(my_generator()) as aiter:
async for value in aiter:
if value == 42:
break

This is cumbersome, but Python unfortunately doesn't provide any other
reliable options. If you use ``aclosing()``, then
your generator's cleanup code executes in the same context as the
rest of its iterations, so timeouts, exceptions, and context
variables work like you'd expect.

If you don't use ``aclosing()``, then Trio will do
its best anyway, but you'll have to contend with the following semantics:

* The cleanup of the generator occurs in a cancelled context, i.e.,
all blocking calls executed during cleanup will raise `Cancelled`.
This is to compensate for the fact that any timeouts surrounding
the original use of the generator have been long since forgotten.

* The cleanup runs without access to any :ref:`context variables
<task-local-storage>` that may have been present when the generator
was originally being used.

* If the generator raises an exception during cleanup, then it's
printed to the ``trio.async_generator_errors`` logger and otherwise
ignored.

* If an async generator is still alive at the end of the whole
call to :func:`trio.run`, then it will be cleaned up after all
tasks have exited and before :func:`trio.run` returns.
Since the "system nursery" has already been closed at this point,
Trio isn't able to support any new calls to
:func:`trio.lowlevel.spawn_system_task`.

If you plan to run your code on PyPy to take advantage of its better
performance, you should be aware that PyPy is *far more likely* than
CPython to perform async generator cleanup at a time well after the
last use of the generator. (This is a consequence of the fact that
PyPy does not use reference counting to manage memory.) To help catch
issues like this, Trio will issue a `ResourceWarning` (ignored by
default, but enabled when running under ``python -X dev`` for example)
for each async generator that needs to be handled through the fallback
finalization path.

Cancel scopes and nurseries
~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. warning:: You may not write a ``yield`` statement that suspends an async generator
inside a `CancelScope` or `Nursery` that was entered within the generator.

That is, this is OK::

async def some_agen():
with trio.move_on_after(1):
await long_operation()
yield "first"
async with trio.open_nursery() as nursery:
nursery.start_soon(task1)
nursery.start_soon(task2)
yield "second"
...

But this is not::

async def some_agen():
with trio.move_on_after(1):
yield "first"
async with trio.open_nursery() as nursery:
yield "second"
...

Async generators decorated with ``@asynccontextmanager`` to serve as
the template for an async context manager are *not* subject to this
constraint, because ``@asynccontextmanager`` uses them in a limited
way that doesn't create problems.

Violating the rule described in this section will sometimes get you a
useful error message, but Trio is not able to detect all such cases,
so sometimes you'll get an unhelpful `TrioInternalError`. (And
sometimes it will seem to work, which is probably the worst outcome of
all, since then you might not notice the issue until you perform some
minor refactoring of the generator or the code that's iterating it, or
just get unlucky. There is a `proposed Python enhancement
<https://discuss.python.org/t/preventing-yield-inside-certain-context-managers/1091>`__
that would at least make it fail consistently.)

The reason for the restriction on cancel scopes has to do with the
difficulty of noticing when a generator gets suspended and
resumed. The cancel scopes inside the generator shouldn't affect code
running outside the generator, but Trio isn't involved in the process
of exiting and reentering the generator, so it would be hard pressed
to keep its cancellation plumbing in the correct state. Nurseries
use a cancel scope internally, so they have all the problems of cancel
scopes plus a number of problems of their own: for example, when
the generator is suspended, what should the background tasks do?
There's no good way to suspend them, but if they keep running and throw
an exception, where can that exception be reraised?

If you have an async generator that wants to ``yield`` from within a nursery
or cancel scope, your best bet is to refactor it to be a separate task
that communicates over memory channels.

For more discussion and some experimental partial workarounds, see
Trio issues `264 <https://github.com/python-trio/trio/issues/264>`__
(especially `this comment
<https://github.com/python-trio/trio/issues/264#issuecomment-418989328>`__)
and `638 <https://github.com/python-trio/trio/issues/638>`__.


.. _threads:

Threads (if you must)
Expand Down
6 changes: 6 additions & 0 deletions newsfragments/265.headline.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
Trio now supports automatic :ref:`async generator finalization
<async-generators>`, so more async generators will work even if you
don't wrap them in ``async with async_generator.aclosing():``
blocks. Please see the documentation for important caveats; in
particular, yielding within a nursery or cancel scope remains
unsupported.
14 changes: 9 additions & 5 deletions trio/_core/_entry_queue.py
Original file line number Diff line number Diff line change
Expand Up @@ -56,7 +56,15 @@ def run_cb(job):
async def kill_everything(exc):
raise exc

_core.spawn_system_task(kill_everything, exc)
try:
_core.spawn_system_task(kill_everything, exc)
except RuntimeError:
# We're quite late in the shutdown process and the
# system nursery is already closed.
# TODO(2020-06): this is a gross hack and should
# be fixed soon when we address #1607.
_core.current_task().parent_nursery.start_soon(kill_everything, exc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a gross hack and I don't like it. (And I don't like the subtle coupling with the finalize_remaining_asyncgens code, where it has to use a shield because of this.)

But maybe there's no better solution for right now, and we should merge this as is, and then address #1607 in a separate PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is very much a gross hack, but I think it's the best solution for right now. I'll leave a comment.


return True

# This has to be carefully written to be safe in the face of new items
Expand Down Expand Up @@ -102,10 +110,6 @@ def close(self):
def size(self):
return len(self.queue) + len(self.idempotent_queue)

def spawn(self):
name = "<TrioToken.run_sync_soon task>"
_core.spawn_system_task(self.task, name=name)

def run_sync_soon(self, sync_fn, *args, idempotent=False):
with self.lock:
if self.done:
Expand Down
Loading