Allow tunnel server upgrade without disconnecting user environments #233
I think the problem is that once there are two tunnel servers (old and new) and:
If we only want to support upgrades and HA, an alternative solution is a modified blue-green deployment of two instances with a swap mechanism:
During this time, the old deployment is still alive and forwards traffic to the CTA.
You're right, it won't work as I suggested. I like the blue/green idea, but I think there's a way to do it without two URLs at the CTA. By extracting the stunnel/sslh to a separate deployment, we can define different k8s services for the SSH and HTTP endpoints. Normally they will point to the same deployment. When upgrading:
This sounds a bit painful though, and if it's covered by the distributed tunnel server solution, maybe it's best to wait for that.
We can use DNS SRV records so the CTA only needs to know one URL; this can be optional for simplicity. Deployment itself shouldn't be too difficult with K8S ->
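
To make the SRV idea a bit more concrete, here is a minimal sketch of how a client such as the CTA might choose a target from a set of SRV records, following the usual SRV semantics (lowest priority value wins, weights break ties within a priority). The `SrvRecord` type, the `pick_target` helper, and the hostnames are all hypothetical, and the actual DNS lookup is left out; swapping blue and green by changing priorities is an assumption, not something decided in this thread.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    """Hypothetical container for the fields of one DNS SRV record."""
    priority: int
    weight: int
    port: int
    target: str


def pick_target(records, rng=random):
    """Pick one record: lowest priority first, weighted random within it."""
    if not records:
        raise ValueError("no SRV records to choose from")
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero: fall back to a uniform choice.
        return rng.choice(candidates)
    return rng.choices(candidates, weights=weights, k=1)[0]


# Assumed record set: "blue" is preferred until the priorities are swapped.
before_swap = [
    SrvRecord(priority=10, weight=1, port=443, target="tunnel-blue.example.internal"),
    SrvRecord(priority=20, weight=1, port=443, target="tunnel-green.example.internal"),
]
after_swap = [
    SrvRecord(priority=20, weight=1, port=443, target="tunnel-blue.example.internal"),
    SrvRecord(priority=10, weight=1, port=443, target="tunnel-green.example.internal"),
]
```

With records like these, the CTA would keep a single service name and pick up the new instance on its next lookup, at the cost of depending on DNS TTLs for how quickly the swap takes effect.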
The different services approach (stunnel/sslh) with a single tunnel server deployment is a bit tricky because there are multiple CTAs here, so there isn't a single switch.
Currently, when deploying the tunnel server, user environments will be briefly disconnected while the CTA (agent) reconnects to the new instance. This can cause incoming requests to the environments to fail with 502 "environment not found" errors.
Suggested solution - a cooperative rollout flow, compatible with Kubernetes rolling update (although it's quite generic and can be used with other orchestration infra).
The tunnel server will handle SIGTERM to start a graceful shutdown flow. It will notify its connected clients to reconnect (see below). It will then wait until all its client connections have ended or a configurable timeout has passed, then exit.

Currently there is no simple way for the SSH server to notify its clients of an event. An application-level "server events" channel can be created by having the CTA initiate a specific "control" command session (exec) on its client connection and wait for it to end as a signal. Alternatively, instead of using the SSH connection, the CTA can accept an HTTP request on its own API endpoint. However, this requires the tunnel server to identify the specific tunnel for each connected CTA.
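
The drain logic described above could look roughly like the sketch below. It is only an illustration under assumptions: `TunnelServer`, `agent`, and `drain_timeout` are hypothetical names, an `asyncio.Event` per client stands in for the "control" session ending, and no real SSH connections or signal handlers are involved. In a real server, `shutdown()` would be triggered from the SIGTERM handler.

```python
import asyncio


class TunnelServer:
    """Hypothetical stand-in for the tunnel server's connection bookkeeping.

    Create instances inside a running event loop.
    """

    def __init__(self, drain_timeout):
        self.drain_timeout = drain_timeout  # seconds to wait for clients to leave
        self.clients = {}                   # client id -> "please reconnect" event
        self._all_gone = asyncio.Event()
        self._all_gone.set()                # no clients connected yet

    def connect(self, client_id):
        """Register a client; the returned event models its control session."""
        reconnect = asyncio.Event()
        self.clients[client_id] = reconnect
        self._all_gone.clear()
        return reconnect

    def disconnect(self, client_id):
        self.clients.pop(client_id, None)
        if not self.clients:
            self._all_gone.set()

    async def shutdown(self):
        """Notify clients, then wait for them to leave or for the timeout.

        Returns True if every client disconnected before the timeout.
        """
        for reconnect in self.clients.values():
            reconnect.set()  # models ending the control session
        try:
            await asyncio.wait_for(self._all_gone.wait(), self.drain_timeout)
            return True
        except asyncio.TimeoutError:
            return False


async def agent(server, client_id, reconnect_delay, log):
    """Hypothetical CTA: wait for the signal, reconnect elsewhere, then leave."""
    reconnect = server.connect(client_id)
    await reconnect.wait()                # control session ended: time to move
    await asyncio.sleep(reconnect_delay)  # stands in for connecting to the new instance
    log.append(client_id)
    server.disconnect(client_id)          # only now drop the old connection
```

The point of the ordering in `agent` is that the old connection is dropped only after the new one is up, so the old instance keeps serving traffic during the handover. The timeout keeps a stuck client from blocking the rollout indefinitely.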