
Resumable protocol


Overview

The client maintains a pool of resumable TCP connections. Periodically, the "worst" one is reset, where "worst" means something like the connection with the highest worst-case latency.

An easy way of implementing this is to have one goroutine continually reopen dead connections, while another sleeps and periodically kills the worst one. Timeouts and similar failures will, of course, also force reopenings. A reopening might discover that the server has already discarded the previous smux state, in which case the client discards it too.
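
A minimal sketch of those two maintenance goroutines in Go; all names here (Pool, Conn, dialResumable) are hypothetical, not taken from the actual codebase:

```go
// Illustrative sketch only: Pool, Conn, and dialResumable are hypothetical
// names, not taken from the actual codebase.
package pool

import (
	"sync"
	"time"
)

// Conn stands in for one resumable TCP connection, tracking the worst-case
// latency seen on it.
type Conn struct {
	WorstLatency time.Duration
}

// dialResumable stands in for opening a fresh resumable TCP connection.
func dialResumable() (*Conn, error) {
	return &Conn{}, nil
}

// Pool holds the client's resumable connections.
type Pool struct {
	mu    sync.Mutex
	conns []*Conn
	size  int // target number of connections
}

// refillLoop continually reopens dead connections so the pool stays full.
func (p *Pool) refillLoop() {
	for {
		p.mu.Lock()
		missing := p.size - len(p.conns)
		p.mu.Unlock()
		for i := 0; i < missing; i++ {
			if c, err := dialResumable(); err == nil {
				p.mu.Lock()
				p.conns = append(p.conns, c)
				p.mu.Unlock()
			}
		}
		time.Sleep(time.Second)
	}
}

// killWorstLoop sleeps, then resets the connection with the highest
// worst-case latency; refillLoop notices the gap and replaces it.
func (p *Pool) killWorstLoop(interval time.Duration) {
	for {
		time.Sleep(interval)
		p.mu.Lock()
		worst := -1
		for i, c := range p.conns {
			if worst < 0 || c.WorstLatency > p.conns[worst].WorstLatency {
				worst = i
			}
		}
		if worst >= 0 {
			p.conns = append(p.conns[:worst], p.conns[worst+1:]...)
		}
		p.mu.Unlock()
	}
}
```

Separating the refill and cull loops keeps the reset path trivial: killing a connection is just removing it from the pool, and the refill loop restores capacity on its next pass.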

Client-exit protocol

Resumable TCP mode is indicated with tinySS protocol code R. The initial authentication procedure is the same; afterwards, the following phases happen:

Client hello

Client simply sends a hello:

  • 32 bytes: a "metasession" ID, used only for statistics

  • 32 bytes: a globally unique session ID that identifies a smux context

Server response

Server simply sends one byte:

  • 0: new smux session generated

  • 1: old smux session resumed
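
For concreteness, this exchange could look like the following in Go, assuming conn is the already-authenticated tinySS stream; the helper name clientHello is illustrative:

```go
package resumable

import (
	"fmt"
	"io"
)

// clientHello runs the hello phase over an already-authenticated tinySS
// stream and reports whether the server resumed an old smux session.
func clientHello(conn io.ReadWriter, metasession, sessionID [32]byte) (bool, error) {
	// 32 bytes: "metasession", used only for statistics.
	if _, err := conn.Write(metasession[:]); err != nil {
		return false, err
	}
	// 32 bytes: globally unique session ID identifying a smux context.
	if _, err := conn.Write(sessionID[:]); err != nil {
		return false, err
	}
	// The server answers with a single byte: 0 = new, 1 = resumed.
	var resp [1]byte
	if _, err := io.ReadFull(conn, resp[:]); err != nil {
		return false, err
	}
	switch resp[0] {
	case 0:
		return false, nil
	case 1:
		return true, nil
	default:
		return false, fmt.Errorf("bad server response %v", resp[0])
	}
}
```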

Client algorithm

There's a persistent pool of N session IDs, as well as a pool of up to N smux sessions.

A monitor goroutine continually creates connections and adds them to the pool whenever the pool is below capacity.

Each session is kept in a structure that includes its underlying resumable TCP connection. When a connection is created, after the hello, if the server replies 1, the existing resumable TCP connection is resumed; otherwise, the smux context is thrown away and the whole struct is recreated.

The smux contexts are configured with impractically long timeout intervals, since we don't rely on smux timeouts for anything.
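
Under the same illustrative naming, the resume-or-recreate decision might look like this, reusing the clientHello helper sketched above and assuming the smux in question is github.com/xtaci/smux:

```go
package resumable

import (
	"io"
	"time"

	"github.com/xtaci/smux"
)

// resumableConn stands in for the resumable TCP wrapper; smux just needs
// an io.ReadWriteCloser to ride on.
type resumableConn struct {
	io.ReadWriteCloser
}

// session pairs a smux context with its underlying resumable connection.
type session struct {
	metasession [32]byte
	sessionID   [32]byte
	rconn       *resumableConn
	mux         *smux.Session
}

// reopen redials the exit and runs the hello. On a 1 reply the existing
// resumable connection, and thus the smux context, simply picks up where
// it left off; on a 0 reply the old smux state is thrown away and the
// whole struct is recreated around a fresh smux context.
func reopen(old *session) (*session, error) {
	resumed, err := clientHello(old.rconn, old.metasession, old.sessionID)
	if err != nil {
		return nil, err
	}
	if resumed {
		return old, nil // server still has our smux state
	}
	cfg := smux.DefaultConfig()
	// Impractically long keepalive timeout: we don't rely on smux
	// timeouts for anything.
	cfg.KeepAliveTimeout = 24 * time.Hour
	mux, err := smux.Client(old.rconn, cfg)
	if err != nil {
		return nil, err
	}
	return &session{
		metasession: old.metasession,
		sessionID:   old.sessionID,
		rconn:       old.rconn,
		mux:         mux,
	}, nil
}
```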

The timeout for marking a connection as dead is 1 second plus the worst latency seen in the past 5 minutes. Very occasionally marking a connection dead spuriously is fine, because the sessions are resumable.
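
One way to implement that rule is a small sliding-window latency tracker; again, the names are illustrative:

```go
package resumable

import (
	"sync"
	"time"
)

// latencyTracker remembers timestamped latency samples so the worst
// latency over a sliding 5-minute window can be computed.
type latencyTracker struct {
	mu      sync.Mutex
	samples []latencySample
}

type latencySample struct {
	at  time.Time
	lat time.Duration
}

// record stores a new round-trip latency measurement.
func (t *latencyTracker) record(lat time.Duration) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.samples = append(t.samples, latencySample{time.Now(), lat})
}

// deathTimeout returns 1 second plus the worst latency seen in the past
// 5 minutes; a connection unresponsive for longer than this is marked
// dead. Stale samples are pruned as a side effect.
func (t *latencyTracker) deathTimeout() time.Duration {
	t.mu.Lock()
	defer t.mu.Unlock()
	cutoff := time.Now().Add(-5 * time.Minute)
	worst := time.Duration(0)
	kept := t.samples[:0]
	for _, s := range t.samples {
		if s.at.After(cutoff) {
			kept = append(kept, s)
			if s.lat > worst {
				worst = s.lat
			}
		}
	}
	t.samples = kept
	return time.Second + worst
}
```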

This whole pool is what's visualized in the Geph GUI, replacing the e2e info.