Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor(kad): use oneshots and async-await for outbound streams #4901

Merged
merged 11 commits into from
Nov 22, 2023

Conversation

thomaseizinger
Copy link
Contributor

@thomaseizinger thomaseizinger commented Nov 20, 2023

Description

This refactoring addresses several aspects of the current handler implementation:

  • Remove the manual state machine for outbound streams in favor of using async-await.
  • Use oneshots to track the number of requested outbound streams
  • Use futures_bounded::FuturesMap to track the execution of a stream, thus applying a timeout to the entire request.

Resolves: #3130.
Related: #3268.
Related: #4510.

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this is great!

image

🎉

I don't find the pattern of request-stream-and-pass-it-through-oneshot-to-the-already-started-future intuitive. Given the use of a oneshot, I would expect some level of concurrency, whereas this is strictly sequential.

Also, long term, I would like to have backpressure between NetworkBehaviour and ConnectionHandler. The ConnectionHandler should only accept a new request from the NetworkBehaviour, when the previous request has a new channel and is being sent. The current pattern does not suite this. Though one could argue that this is easy to refactor, once we introduce backpressure.

I see the problem, and appreciate the work here and on #4834. I suggest we discuss this in tomorrow's open maintainers call. In case we can not come up with an alternative, I suggest we go with your proposal here.

protocols/kad/src/handler.rs Outdated Show resolved Hide resolved
protocols/kad/src/handler.rs Outdated Show resolved Hide resolved
@thomaseizinger
Copy link
Contributor Author

Also, long term, I would like to have backpressure between NetworkBehaviour and ConnectionHandler. The ConnectionHandler should only accept a new request from the NetworkBehaviour, when the previous request has a new channel and is being sent. The current pattern does not suite this.

Why not? At any point, we know how many streams are currently being opened by checking how many Senders we have. A backpressure-aware API can make use of this information to slow down incoming requests.

I don't find the pattern of request-stream-and-pass-it-through-oneshot-to-the-already-started-future intuitive. Given the use of a oneshot, I would expect some level of concurrency, whereas this is strictly sequential.

I am not sure I entirely follow your reasoning here. To me, a oneshot is useful if I want to suspend a Future on some missing state at the time of constructing it. It is simply a level of indirection.

The oneshot is only necessary because our current APIs don't support expressing the sequential nature of opening a stream and doing something with it using async-await. In an ideal world, a ConnectionHandler would have a way of obtaining a Future that provides it with a Stream such that it can directly compose the entire protocol onto it.

In fact, the more I think about this, the more obvious and elegant it appears to me because it simplifies so much of the state management in ConnectionHandlers.

@thomaseizinger
Copy link
Contributor Author

In case we can not come up with an alternative, I suggest we go with your proposal here.

I am open to alternatives. I couldn't think of any that we can ship as easily as this because many APIs would have to change drastically.

@thomaseizinger
Copy link
Contributor Author

@mxinden I've renamed the function. Note that, due to our magic 32, we still have a potentially unbounded buffer here because messages can pile up in pending_messages.

I'd suggest deferring this to a later point though. For now, I'd like to push the use of oneshots for outbound streams forward. Together with a tutorial on how to write your own ConnectionHandler, I see that as the solution to finialising #3268.

Copy link
Member

@mxinden mxinden left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ready to merge apart from the ongoing discussion above.

protocols/kad/src/handler.rs Outdated Show resolved Hide resolved
@mergify mergify bot merged commit add1ff6 into master Nov 22, 2023
71 checks passed
@mergify mergify bot deleted the refactor/kad-use-oneshots branch November 22, 2023 23:36
mergify bot pushed a commit that referenced this pull request Dec 5, 2023
We mistakenly assumed that `QueryId`s are unique in that, only a single request will be emitted per `QueryId`. This is wrong. A bootstrap for example will issue multiple requests as part of the same `QueryId`. Thus, we cannot use the `QueryId` as a key for the `FuturesMap`. Instead, we use a `FuturesTupleSet` to associate the `QueryId` with the in-flight request.

Related: #4901.
Resolves: #4948.

Pull-Request: #4971.
AgeManning pushed a commit to sigp/rust-libp2p that referenced this pull request Dec 13, 2023
* ci: unset `RUSTFLAGS` value in semver job

Don't fail semver-checking if a dependency version has warnings, such as deprecation notices.

Related: libp2p#4932 (comment).
Related: obi1kenobi/cargo-semver-checks#589.

Pull-Request: libp2p#4942.

* deps(webrtc): bump alpha versions

Bumps versions of `libp2p-webrtc` and `libp2p-webrtc-websys` up one minor version.

Fixes: libp2p#4953.

Pull-Request: libp2p#4959.

* feat(request-response): derive `PartialOrd`,`Ord` for `{Out,In}RequestId`

Pull-Request: libp2p#4956.

* refactor(connection-limits): make `check_limit` a free-function

Pull-Request: libp2p#4958.

* chore(webrtc-utils): bump version to allow for new release

We didn't bump this crate's version despite it depending on `libp2p_noise`. As such, we can't release `libp2p-webrtc-websys` at the moment because it needs a new release of this crate.

Pull-Request: libp2p#4968.

* feat(webrtc-websys): hide `libp2p_noise` from the public API

Currently, `libp2p-webrtc-websys` exposes the `libp2p_noise` dependency in its public API. It should really be a private dependency of the crate. By wrapping it in a new-type, we can achieve this.

Pull-Request: libp2p#4969.

* fix(kad): iterator progress to be decided by any of new peers

Pull-Request: libp2p#4932.

* chore(quic): set `max_idle_timeout` to quinn default timeout

Resolves libp2p#4917.

Pull-Request: libp2p#4965.

* feat(core): impl Display on ListenerId

Fixes: libp2p#4935.

Pull-Request: libp2p#4936.

* feat(server): support websocket

Pull-Request: libp2p#4937.

* feat(swarm): implement `Copy` and `Clone` for `FromSwarm`

We can make `FromSwarm` implement `Copy` and `Close` which makes it much easier to

a) generate code in `libp2p-swarm-derive`
b) manually wrap a `NetworkBehaviour`

Previously, we couldn't do this because `ConnectionClosed` would have a `handler` field that cannot be cloned / copied.

Related: libp2p#4076.
Related: libp2p#4581.

Pull-Request: libp2p#4825.

* deps: bump wasm-bindgen-futures from 0.4.38 to 0.4.39

Pull-Request: libp2p#4946.

* feat(connection-limit): add function to mutate `ConnectionLimits`

Resolves: libp2p#4826.

Pull-Request: libp2p#4964.

* deps: bump web-sys from 0.3.65 to 0.3.66

Pull-Request: libp2p#4976.

* deps: bump wasm-bindgen-test from 0.3.38 to 0.3.39

Pull-Request: libp2p#4975.

* fix(kad): don't assume `QuerId`s are unique

We mistakenly assumed that `QueryId`s are unique in that, only a single request will be emitted per `QueryId`. This is wrong. A bootstrap for example will issue multiple requests as part of the same `QueryId`. Thus, we cannot use the `QueryId` as a key for the `FuturesMap`. Instead, we use a `FuturesTupleSet` to associate the `QueryId` with the in-flight request.

Related: libp2p#4901.
Resolves: libp2p#4948.

Pull-Request: libp2p#4971.

* fix(webrtc example): clarify idle connection timeout

When I ran the `example/browser-webrtc` example I discovered it would break after a ping or two.
The `Ping` idle timeout needed to be extended, on both the server and the wasm client, which is what this PR fixes.
I also added a small note to the README about ensuring `wasm-pack` is install for the users who are new to the ecosystem.

Fixes: libp2p#4950.

Pull-Request: libp2p#4966.

* docs(examples/readme): fix broken link

Related: libp2p#3536.

Pull-Request: libp2p#4984.

* feat(yamux): auto-tune (dynamic) stream receive window

libp2p/rust-yamux#176 enables auto-tuning for the Yamux stream receive window. While preserving small buffers on low-latency and/or low-bandwidth connections, this change allows for high-latency and/or high-bandwidth connections to exhaust the available bandwidth on a single stream.

Using the [libp2p perf](https://github.com/libp2p/test-plans/blob/master/perf/README.md) benchmark tools (60ms, 10Gbit/s) shows an **improvement from 33 Mbit/s to 1.3 Gbit/s** in single stream throughput.

See libp2p/rust-yamux#176 for details.

To ship the above Rust Yamux change in a libp2p patch release (non-breaking), this pull request uses `yamux` `v0.13` (new version) by default and falls back to `yamux` `v0.12` (old version) when setting any configuration options. Thus default users benefit from the increased performance, while power users with custom configurations maintain the old behavior.

Pull-Request: libp2p#4970.

* deps: bump actions/deploy-pages from 2 to 3

Pull-Request: libp2p#4978.

* deps: bump the axum group with 2 updates

Pull-Request: libp2p#4943.

* chore(webrtc-websys): remove unused dependencies

Pull-Request: libp2p#4973.

* chore(quic): fix link to PR in changelog

Pull-Request: libp2p#4993.

* deps: bump tokio from 1.34.0 to 1.35.0

Pull-Request: libp2p#4995.

* deps: bump syn from 2.0.39 to 2.0.40

Pull-Request: libp2p#4996.

* deps: bump once_cell from 1.18.0 to 1.19.0

Pull-Request: libp2p#4998.

---------

Co-authored-by: Predrag Gruevski <[email protected]>
Co-authored-by: Doug A <[email protected]>
Co-authored-by: Darius Clark <[email protected]>
Co-authored-by: zhiqiangxu <[email protected]>
Co-authored-by: Thomas Eizinger <[email protected]>
Co-authored-by: maqi <[email protected]>
Co-authored-by: stormshield-frb <[email protected]>
Co-authored-by: Max Inden <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: NAHO <[email protected]>
AgeManning pushed a commit to sigp/rust-libp2p that referenced this pull request Jan 15, 2024
* ci: unset `RUSTFLAGS` value in semver job

Don't fail semver-checking if a dependency version has warnings, such as deprecation notices.

Related: libp2p#4932 (comment).
Related: obi1kenobi/cargo-semver-checks#589.

Pull-Request: libp2p#4942.

* deps(webrtc): bump alpha versions

Bumps versions of `libp2p-webrtc` and `libp2p-webrtc-websys` up one minor version.

Fixes: libp2p#4953.

Pull-Request: libp2p#4959.

* feat(request-response): derive `PartialOrd`,`Ord` for `{Out,In}RequestId`

Pull-Request: libp2p#4956.

* refactor(connection-limits): make `check_limit` a free-function

Pull-Request: libp2p#4958.

* chore(webrtc-utils): bump version to allow for new release

We didn't bump this crate's version despite it depending on `libp2p_noise`. As such, we can't release `libp2p-webrtc-websys` at the moment because it needs a new release of this crate.

Pull-Request: libp2p#4968.

* feat(webrtc-websys): hide `libp2p_noise` from the public API

Currently, `libp2p-webrtc-websys` exposes the `libp2p_noise` dependency in its public API. It should really be a private dependency of the crate. By wrapping it in a new-type, we can achieve this.

Pull-Request: libp2p#4969.

* fix(kad): iterator progress to be decided by any of new peers

Pull-Request: libp2p#4932.

* chore(quic): set `max_idle_timeout` to quinn default timeout

Resolves libp2p#4917.

Pull-Request: libp2p#4965.

* feat(core): impl Display on ListenerId

Fixes: libp2p#4935.

Pull-Request: libp2p#4936.

* feat(server): support websocket

Pull-Request: libp2p#4937.

* feat(swarm): implement `Copy` and `Clone` for `FromSwarm`

We can make `FromSwarm` implement `Copy` and `Close` which makes it much easier to

a) generate code in `libp2p-swarm-derive`
b) manually wrap a `NetworkBehaviour`

Previously, we couldn't do this because `ConnectionClosed` would have a `handler` field that cannot be cloned / copied.

Related: libp2p#4076.
Related: libp2p#4581.

Pull-Request: libp2p#4825.

* deps: bump wasm-bindgen-futures from 0.4.38 to 0.4.39

Pull-Request: libp2p#4946.

* feat(connection-limit): add function to mutate `ConnectionLimits`

Resolves: libp2p#4826.

Pull-Request: libp2p#4964.

* deps: bump web-sys from 0.3.65 to 0.3.66

Pull-Request: libp2p#4976.

* deps: bump wasm-bindgen-test from 0.3.38 to 0.3.39

Pull-Request: libp2p#4975.

* fix(kad): don't assume `QuerId`s are unique

We mistakenly assumed that `QueryId`s are unique in that, only a single request will be emitted per `QueryId`. This is wrong. A bootstrap for example will issue multiple requests as part of the same `QueryId`. Thus, we cannot use the `QueryId` as a key for the `FuturesMap`. Instead, we use a `FuturesTupleSet` to associate the `QueryId` with the in-flight request.

Related: libp2p#4901.
Resolves: libp2p#4948.

Pull-Request: libp2p#4971.

* fix(webrtc example): clarify idle connection timeout

When I ran the `example/browser-webrtc` example I discovered it would break after a ping or two.
The `Ping` idle timeout needed to be extended, on both the server and the wasm client, which is what this PR fixes.
I also added a small note to the README about ensuring `wasm-pack` is install for the users who are new to the ecosystem.

Fixes: libp2p#4950.

Pull-Request: libp2p#4966.

* docs(examples/readme): fix broken link

Related: libp2p#3536.

Pull-Request: libp2p#4984.

* feat(yamux): auto-tune (dynamic) stream receive window

libp2p/rust-yamux#176 enables auto-tuning for the Yamux stream receive window. While preserving small buffers on low-latency and/or low-bandwidth connections, this change allows for high-latency and/or high-bandwidth connections to exhaust the available bandwidth on a single stream.

Using the [libp2p perf](https://github.com/libp2p/test-plans/blob/master/perf/README.md) benchmark tools (60ms, 10Gbit/s) shows an **improvement from 33 Mbit/s to 1.3 Gbit/s** in single stream throughput.

See libp2p/rust-yamux#176 for details.

To ship the above Rust Yamux change in a libp2p patch release (non-breaking), this pull request uses `yamux` `v0.13` (new version) by default and falls back to `yamux` `v0.12` (old version) when setting any configuration options. Thus default users benefit from the increased performance, while power users with custom configurations maintain the old behavior.

Pull-Request: libp2p#4970.

* deps: bump actions/deploy-pages from 2 to 3

Pull-Request: libp2p#4978.

* deps: bump the axum group with 2 updates

Pull-Request: libp2p#4943.

* chore(webrtc-websys): remove unused dependencies

Pull-Request: libp2p#4973.

* chore(quic): fix link to PR in changelog

Pull-Request: libp2p#4993.

* deps: bump tokio from 1.34.0 to 1.35.0

Pull-Request: libp2p#4995.

* deps: bump syn from 2.0.39 to 2.0.40

Pull-Request: libp2p#4996.

* deps: bump once_cell from 1.18.0 to 1.19.0

Pull-Request: libp2p#4998.

* deps: bump hkdf from 0.12.3 to 0.12.4

Pull-Request: libp2p#5009.

* deps: bump clap from 4.4.10 to 4.4.11

Pull-Request: libp2p#4997.

* deps: bump thiserror from 1.0.50 to 1.0.51

Pull-Request: libp2p#5010.

* deps: bump syn from 2.0.40 to 2.0.41

Pull-Request: libp2p#5011.

* deps: bump async-io from 2.2.1 to 2.2.2

Pull-Request: libp2p#5012.

* deps: bump rust-embed from 8.0.0 to 8.1.0

Pull-Request: libp2p#5000.

* chore(deps): bump golang.org/x/crypto from 0.7.0 to 0.17.0

Pull-Request: libp2p#5019.

* deps: bump libc from 0.2.150 to 0.2.151

Pull-Request: libp2p#5002.

* docs: remove [email protected]

I no longer have access to the mailing list. See
libp2p#5007.

Pull-Request: libp2p#5020.

* chore: fix typos

Pull-Request: libp2p#5021.

* fix(derive): restore support for inline generic type constraints

Fixes the `#[NetworkBehaviour]` macro to support generic constraints on behaviours without a where clause, which was the case before v0.51.

Pull-Request: libp2p#5003.

* deps: bump actions/deploy-pages from 3 to 4

Pull-Request: libp2p#5022.

* chore: fix several typos in documentation

Pull-Request: libp2p#5008.

* deps: bump async-trait from 0.1.74 to 0.1.75

Pull-Request: libp2p#5029.

* deps: bump anyhow from 1.0.75 to 1.0.76

Pull-Request: libp2p#5030.

* deps: bump futures-util from 0.3.29 to 0.3.30

Pull-Request: libp2p#5031.

* deps: bump syn from 2.0.41 to 2.0.43

Pull-Request: libp2p#5033.

* deps: bump tokio from 1.35.0 to 1.35.1

Pull-Request: libp2p#5034.

* deps: bump reqwest from 0.11.22 to 0.11.23

Pull-Request: libp2p#5035.

* deps: bump futures from 0.3.29 to 0.3.30

Pull-Request: libp2p#5032.

* deps: bump trybuild from 1.0.85 to 1.0.86

Pull-Request: libp2p#5036.

* deps: bump proc-macro2 from 1.0.69 to 1.0.71

Pull-Request: libp2p#5041.

* deps: bump actions/upload-pages-artifact from 2.0.0 to 3.0.0

Pull-Request: libp2p#5023.

* deps: bump Rust to 1.75 and fix clippy lints

Pull-Request: libp2p#5043.

* deps: bump thiserror from 1.0.51 to 1.0.53

Pull-Request: libp2p#5044.

* deps: bump clap from 4.4.11 to 4.4.12

Pull-Request: libp2p#5046.

* deps: bump tempfile from 3.8.1 to 3.9.0

Pull-Request: libp2p#5047.

* deps: bump rust-embed from 8.1.0 to 8.2.0

Pull-Request: libp2p#5049.

* deps: bump serde_json from 1.0.108 to 1.0.109

Pull-Request: libp2p#5050.

* deps: bump anyhow from 1.0.76 to 1.0.78

Pull-Request: libp2p#5051.

* deps: bump proc-macro2 from 1.0.71 to 1.0.73

Pull-Request: libp2p#5054.

* deps: bump quote from 1.0.33 to 1.0.34

Pull-Request: libp2p#5055.

* deps: bump anyhow from 1.0.78 to 1.0.79

Pull-Request: libp2p#5062.

* deps: bump serde_json from 1.0.109 to 1.0.111

Pull-Request: libp2p#5063.

* deps: bump thiserror from 1.0.53 to 1.0.56

Pull-Request: libp2p#5064.

* deps: bump libc from 0.2.151 to 0.2.152

Pull-Request: libp2p#5065.

* deps: bump trybuild from 1.0.86 to 1.0.88

Pull-Request: libp2p#5068.

* deps: bump proc-macro2 from 1.0.73 to 1.0.76

Pull-Request: libp2p#5069.

* deps: bump clap from 4.4.12 to 4.4.13

Pull-Request: libp2p#5070.

* deps: bump Swatinem/rust-cache from 2.7.1 to 2.7.2

Pull-Request: libp2p#5076.

* deps: bump tj-actions/glob from 17 to 18

Pull-Request: libp2p#5058.

* deps: bump the axum group with 1 update

Pull-Request: libp2p#5045.

* deps: bump quote from 1.0.34 to 1.0.35

Pull-Request: libp2p#5071.

* deps: bump async-trait from 0.1.75 to 0.1.77

Pull-Request: libp2p#5081.

* ci: add dependabot group for webrtc

Pull-Request: libp2p#5082.

* deps: bump base64 from 0.21.5 to 0.21.7

Pull-Request: libp2p#5086.

* deps: bump trybuild from 1.0.88 to 1.0.89

Pull-Request: libp2p#5087.

* deps: bump js-sys from 0.3.66 to 0.3.67

Pull-Request: libp2p#5091.

* deps: bump wasm-bindgen from 0.2.89 to 0.2.90

Pull-Request: libp2p#5089.

* add PeerId to ListenFailure

---------

Co-authored-by: Predrag Gruevski <[email protected]>
Co-authored-by: Doug A <[email protected]>
Co-authored-by: Darius Clark <[email protected]>
Co-authored-by: zhiqiangxu <[email protected]>
Co-authored-by: Thomas Eizinger <[email protected]>
Co-authored-by: maqi <[email protected]>
Co-authored-by: stormshield-frb <[email protected]>
Co-authored-by: Max Inden <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: NAHO <[email protected]>
Co-authored-by: alex <[email protected]>
Co-authored-by: Akosh Farkash <[email protected]>
Co-authored-by: Frieren <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

protocols/kad/: Replace manual procedural state machines with async/await
3 participants