feat(swarm): set default for idle-connection-timeout to 10s #4967

Merged
jxs merged 10 commits into master from feat/idle-timeout
Dec 13, 2024

Conversation

thomaseizinger
Contributor

Description

With the move to a global idle-connection-timeout, connections are being closed much more aggressively. This causes problems in situations where, e.g. an application wants to use a connection shortly after an event has been emitted from the Swarm. With a default of 0 seconds, such a connection is instantly considered idle and therefore closed, despite the application wanting to use it again just moments later. Whilst it is possible to structure application code to mitigate this, it is unnecessarily complicated.

Additionally, connections being closed instantly if not in use is a foot-gun for newcomers to the library.

From a technical point-of-view, instantly closing idle connections is nice. In reality, it is an impractical default. Hence, we change this default to 10s.

10 seconds is considered an acceptable default: it strikes a balance between tolerating short pauses in network activity and freeing up resources that are (supposedly) no longer needed.
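To make the difference between the two defaults concrete, here is a minimal sketch of the decision described above. It is a simplified model written for illustration, not libp2p's actual implementation; `should_close`, `OLD_DEFAULT`, and `NEW_DEFAULT` are hypothetical names.

```rust
use std::time::Duration;

/// Hypothetical model, not libp2p's real code: decides whether a connection
/// that has been idle for `idle_for` should be closed under a given
/// idle-connection timeout.
pub fn should_close(idle_for: Duration, idle_timeout: Duration) -> bool {
    idle_for >= idle_timeout
}

/// The previous default described above: idle connections close immediately.
pub const OLD_DEFAULT: Duration = Duration::ZERO;

/// The new default proposed in this PR.
pub const NEW_DEFAULT: Duration = Duration::from_secs(10);
```

Under this model, an application that pauses for 200 ms between a `Swarm` event and its next use of the connection loses the connection with `OLD_DEFAULT` but keeps it with `NEW_DEFAULT`. Applications that need a different value should still be able to override the default through the swarm configuration (`with_idle_connection_timeout` on `libp2p::swarm::Config`, to the best of my knowledge).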

Resolves: #4912.

Notes & open questions

Change checklist

  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • A changelog entry has been made in the appropriate crates

@DougAnderson444
Contributor

I like this idea. Anyone coming to libp2p is likely going to use their connection for more than 0s, so giving them this default makes sense.

I'm not that familiar with what makes each connection or protocol idle vs. what keeps them alive (we recently discussed Ping in #4950, but there are more), and other newcomers to libp2p probably aren't either. Could we potentially add this description to the documentation somewhere? It's like Thomas says: we don't need new users fighting against auto-closing connections when they might want to keep a connection open for their app. Another part of this is a note in the docs where applicable.

Thanks for starting this!

@thomaseizinger
Contributor Author

> I'm not that familiar with what makes each connection or protocol idle vs. what keeps them alive (we recently discussed Ping in #4950, but there are more), and other newcomers to libp2p probably aren't either. Could we potentially add this description to the documentation somewhere?

The main place where this is currently documented is in https://docs.rs/libp2p/latest/libp2p/swarm/trait.ConnectionHandler.html#method.connection_keep_alive. I think users shouldn't need to know this, only implementers of protocols.

> I like this idea. Anyone coming to libp2p is likely going to use their connection for more than 0s, so giving them this default makes sense.

There is some (I guess unintended) irony in this. We automatically keep connections alive while you use them, i.e. while there are active streams.

What this default does is bridge brief moments of inactivity between uses which would otherwise result in an immediate shutdown of the connection because it is idle.
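The interplay between active streams and the idle timeout can be sketched as follows. This is a hypothetical, simplified model (the type `ConnectionModel` and its methods are invented for illustration and do not mirror libp2p's internals); time is passed in explicitly so the behaviour is easy to follow.

```rust
use std::time::Duration;

/// Hypothetical, simplified model of a connection's keep-alive state.
/// Names and structure are illustrative, not libp2p's internals.
pub struct ConnectionModel {
    active_streams: usize,
    /// Time (on a caller-supplied clock) at which the connection became idle.
    idle_since: Option<Duration>,
    idle_timeout: Duration,
}

impl ConnectionModel {
    /// A fresh connection has no streams, so it starts out idle at `now`.
    pub fn new(now: Duration, idle_timeout: Duration) -> Self {
        Self { active_streams: 0, idle_since: Some(now), idle_timeout }
    }

    /// Opening a stream marks the connection as in use.
    pub fn open_stream(&mut self) {
        self.active_streams += 1;
        self.idle_since = None;
    }

    /// Closing the last stream starts the idle timer.
    pub fn close_stream(&mut self, now: Duration) {
        self.active_streams = self.active_streams.saturating_sub(1);
        if self.active_streams == 0 && self.idle_since.is_none() {
            self.idle_since = Some(now);
        }
    }

    /// Never close while in use; otherwise close once the idle period
    /// has reached the configured timeout.
    pub fn should_close(&self, now: Duration) -> bool {
        match self.idle_since {
            None => false,
            Some(since) => now.saturating_sub(since) >= self.idle_timeout,
        }
    }
}
```

With a timeout of zero, `should_close` becomes true at the very instant the last stream closes; with 10 seconds, a stream reopened within that window resets the timer and the connection survives.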

What do you think of the idea voiced in #4912 to not close idle connections at all by default and make it the user's responsibility?

@drHuangMHT
Contributor

Is there anything blocking this PR?

@dariusc93
Member

> Is there anything blocking this PR?

I don't believe so, unless there are cases showing that 10s isn't enough or is too much time. I have no objections to this change, though.

@jxs jxs marked this pull request as ready for review November 29, 2024 17:59
@jxs jxs force-pushed the feat/idle-timeout branch from df20dbc to a83d6b2 Compare November 29, 2024 18:07
dariusc93
dariusc93 previously approved these changes Nov 29, 2024
@jxs jxs added the internal-change Pull requests that make internal changes to crates and thus don't need to include a changelog entry. label Nov 29, 2024
@jxs jxs force-pushed the feat/idle-timeout branch from e95fb60 to cb44c82 Compare November 29, 2024 21:12
jxs
jxs previously approved these changes Nov 29, 2024
Member

@jxs jxs left a comment

Ok seems ready to be merged now, @dariusc93 @guillaumemichel @elenaf9 @drHuangMHT do you feel this should be a breaking change instead?

elenaf9
elenaf9 previously approved these changes Nov 29, 2024
Contributor

@elenaf9 elenaf9 left a comment

Doesn't need to be a breaking change IMO.

@dariusc93
Member

> Ok seems ready to be merged now, @dariusc93 @guillaumemichel @elenaf9 @drHuangMHT do you feel this should be a breaking change instead?

I don't believe this would be a breaking change, since it's just changing the default value. Of course, I am not sure who might be relying on the current default idle timeout either.

@drHuangMHT
Contributor

> Ok seems ready to be merged now, @dariusc93 @guillaumemichel @elenaf9 @drHuangMHT do you feel this should be a breaking change instead?

Nah I don't think so.
Fun fact: I didn't change the default timeout, but it still works as intended, probably because I have the connection set to be kept alive somewhere.

@jxs jxs added the send-it label Nov 30, 2024
@drHuangMHT
Contributor

The required interop testing could not be done due to this error:

Error: 2.473 test/fixtures/relay.ts(20,5): error TS2561: Object literal may only specify known properties, 
but 'connectionEncryption' does not exist in type 'Libp2pOptions<{ identify: Identify; relay: CircuitRelayService; }>'. 
Did you mean to write 'connectionEncrypters'?  

@mergify mergify bot dismissed stale reviews from jxs, guillaumemichel, elenaf9, and dariusc93 December 7, 2024 23:53

Approvals have been dismissed because the PR was updated after the send-it label was applied.

@jxs
Member

jxs commented Dec 13, 2024

> The required interop testing could not be done due to this error:
>
> Error: 2.473 test/fixtures/relay.ts(20,5): error TS2561: Object literal may only specify known properties,
> but 'connectionEncryption' does not exist in type 'Libp2pOptions<{ identify: Identify; relay: CircuitRelayService; }>'.
> Did you mean to write 'connectionEncrypters'?

Yeah, it's a known issue, see libp2p/test-plans#588.
I am going to manually merge this one, as it's probably going to take some days until the interop-tests problem gets solved.

@jxs jxs merged commit f4edafb into master Dec 13, 2024
68 of 70 checks passed
@jxs jxs deleted the feat/idle-timeout branch December 13, 2024 16:28
jxs pushed a commit to jxs/rust-libp2p that referenced this pull request Jan 6, 2025
Labels
internal-change Pull requests that make internal changes to crates and thus don't need to include a changelog entry. send-it
Development

Successfully merging this pull request may close these issues.

Set a better default for idle_connection_timeout
7 participants