Added section on embedded HA to Architecture page (k3s-io#122)
* Add diagram for etcd HA
* Add tabs, remove old version gates

Signed-off-by: Derek Nola <[email protected]>

* Clean up language around fixed registration address

Signed-off-by: Derek Nola <[email protected]>
dereknola authored May 1, 2023
1 parent dc484d8 commit ded1b35
Showing 9 changed files with 3,157 additions and 334 deletions.
45 changes: 31 additions & 14 deletions docs/architecture/architecture.md
@@ -5,6 +5,9 @@ weight: 1

import ThemedImage from '@theme/ThemedImage';
import useBaseUrl from '@docusaurus/useBaseUrl';
import TabItem from '@theme/TabItem';
import Tabs from '@theme/Tabs';


This page describes the architecture of a high-availability K3s server cluster and how it differs from a single-node server cluster.

@@ -30,25 +33,43 @@ In this configuration, each agent node is registered to the same server node. A
}}
/>

### High-Availability K3s

### High-Availability K3s Server with an External DB
Single server clusters can meet a variety of use cases, but for environments where uptime of the Kubernetes control plane is critical, you can run K3s in an HA configuration. An HA K3s cluster comprises:

Single server clusters can meet a variety of use cases, but for environments where uptime of the Kubernetes control plane is critical, you can run K3s in an HA configuration. An HA K3s cluster is comprised of:
<Tabs>
<TabItem value="Embedded DB">

* Three or more **server nodes** that will serve the Kubernetes API and run other control plane services
* An **embedded etcd datastore** (as opposed to the embedded SQLite datastore used in single-server setups)

* Two or more **server nodes** that will serve the Kubernetes API and run other control plane services
* An **external datastore** (as opposed to the embedded SQLite datastore used in single-server setups)

<ThemedImage
alt="K3s Architecture with High-availability Servers"
sources={{
light: useBaseUrl('/img/k3s-architecture-ha-server.svg'),
dark: useBaseUrl('/img/k3s-architecture-ha-server-dark.svg'),
}}
/>
light: useBaseUrl('/img/k3s-architecture-ha-embedded.svg'),
dark: useBaseUrl('/img/k3s-architecture-ha-embedded-dark.svg'),
}} />

</TabItem>
<TabItem value="External DB">

* Two or more **server nodes** that will serve the Kubernetes API and run other control plane services
* An **external datastore** (such as MySQL, PostgreSQL, or etcd)

<ThemedImage
alt="K3s Architecture with High-availability Servers and an External DB"
sources={{
light: useBaseUrl('/img/k3s-architecture-ha-external.svg'),
dark: useBaseUrl('/img/k3s-architecture-ha-external-dark.svg'),
}} />

</TabItem>
</Tabs>
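
To make the difference concrete, here is a minimal sketch of how each topology is typically launched. The hostnames, credentials, and token are placeholders, not values from this page:

```bash
# Embedded etcd: the first server initializes the etcd cluster,
# and additional servers join it by pointing at an existing member.
k3s server --cluster-init
k3s server --server https://server1:6443 --token <token>

# External DB: every server points at the same shared datastore.
k3s server --datastore-endpoint="mysql://user:pass@tcp(db-host:3306)/k3s"
```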

### Fixed Registration Address for Agent Nodes

In the high-availability server configuration, each node must also register with the Kubernetes API by using a fixed registration address, as shown in the diagram below.
In the high-availability server configuration, each node can also register with the Kubernetes API by using a fixed registration address, as shown in the diagram below.

After registration, the agent nodes establish a connection directly to one of the server nodes.
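
As an illustration, assuming a DNS record or load balancer at `k3s.example.com` (a hypothetical address) fronts the server nodes, an agent registers through it like so:

```bash
# The agent registers via the fixed address; after registration it
# maintains direct connections to individual servers through its
# client-side load balancer.
k3s agent --server https://k3s.example.com:6443 --token <token>
```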

@@ -62,14 +83,10 @@

### How Agent Node Registration Works

Agent nodes are registered with a websocket connection initiated by the `k3s agent` process, and the connection is maintained by a client-side load balancer running as part of the agent process. This load-balancer maintains stable connections to all servers in the cluster, providing a connection to the apiserver that tolerates outages of individual servers.
Agent nodes are registered with a websocket connection initiated by the `k3s agent` process, and the connection is maintained by a client-side load balancer running as part of the agent process. Initially, the agent connects to the supervisor (and kube-apiserver) via the local load-balancer on port 6443. The load-balancer maintains a list of available endpoints to connect to. The default (and initially only) endpoint is seeded by the hostname from the `--server` address. Once it connects to the cluster, the agent retrieves a list of kube-apiserver addresses from the Kubernetes service endpoint list in the default namespace. Those endpoints are added to the load balancer, which then maintains stable connections to all servers in the cluster, providing a connection to the kube-apiserver that tolerates outages of individual servers.
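
The endpoint list the agent consumes is the standard `kubernetes` Service endpoint list, so you can inspect it directly:

```bash
# Shows the kube-apiserver addresses that agents add to their
# client-side load balancers.
kubectl get endpoints kubernetes -n default -o yaml
```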

Agents will register with the server using the node cluster secret along with a randomly generated password for the node, stored at `/etc/rancher/node/password`. The server will store the passwords for individual nodes as Kubernetes secrets, and any subsequent attempts must use the same password. Node password secrets are stored in the `kube-system` namespace with names using the template `<host>.node-password.k3s`. This is done to protect the integrity of node IDs.

If the `/etc/rancher/node` directory of an agent is removed, or you wish to rejoin a node using an existing name, the node should be deleted from the cluster. This will clean up both the old node entry and the node password secret, and allow the node to (re)join the cluster.
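
As a sketch, for a node named `agent1` (a hypothetical hostname), inspecting and cleaning up its node password secret looks like this:

```bash
# Inspect the stored node password secret.
kubectl get secret agent1.node-password.k3s -n kube-system -o yaml

# Deleting the node cleans up both the node entry and the password
# secret, allowing the node to (re)join under the same name.
kubectl delete node agent1
```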

:::note
Prior to K3s v1.20.2 servers stored passwords on disk at `/var/lib/rancher/k3s/server/cred/node-passwd`.
:::

If you frequently reuse hostnames, but are unable to remove the node password secrets, a unique node ID can be automatically appended to the hostname by launching K3s servers or agents using the `--with-node-id` flag. When enabled, the node ID is also stored in `/etc/rancher/node/`.
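
For example (placeholder server address and token), an agent launched with the flag registers under a suffixed hostname:

```bash
# Appends a unique node ID to the hostname, avoiding node password
# conflicts when hostnames are reused.
k3s agent --server https://server1:6443 --token <token> --with-node-id
```
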
12 changes: 2 additions & 10 deletions docs/datastore/ha-embedded.md
@@ -3,15 +3,6 @@ title: "High Availability Embedded etcd"
weight: 40
---

:::info Version Gate
Full support as of [v1.19.5+k3s1](https://github.com/k3s-io/k3s/releases/tag/v1.19.5%2Bk3s1)
Experimental support as of [v1.19.1+k3s1](https://github.com/k3s-io/k3s/releases/tag/v1.19.1%2Bk3s1)
:::

:::note Notice: Deprecated Dqlite
Embedded etcd replaced experimental Dqlite in the K3s v1.19.1 release. This is a breaking change. Please note that upgrades from experimental Dqlite to embedded etcd are not supported. If you attempt an upgrade it will not succeed and data will be lost.
:::

:::caution
Embedded etcd (HA) may have performance issues on slower disks, such as the SD cards used with Raspberry Pis.
:::
@@ -36,9 +27,10 @@ $ kubectl get nodes
NAME STATUS ROLES AGE VERSION
server1 Ready control-plane,etcd,master 28m vX.Y.Z
server2 Ready control-plane,etcd,master 13m vX.Y.Z
server3 Ready control-plane,etcd,master 10m vX.Y.Z
```

Now you have a highly available control plane. Any successfully clustered servers can be used in the `--server` argument to join additional server and worker nodes. Joining additional worker nodes to the cluster follows the same procedure as a single server cluster.
Now you have a highly available control plane. Any successfully clustered servers can be used in the `--server` argument to join additional server and agent nodes. Joining additional agent nodes to the cluster follows the same procedure as a single server cluster.
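
As a sketch using the install script (hostname and token are placeholders; the server token can typically be read from `/var/lib/rancher/k3s/server/token` on an existing server):

```bash
# Join an additional server; any healthy cluster member works as --server.
curl -sfL https://get.k3s.io | sh -s - server \
  --server https://server1:6443 --token <token>

# Join an agent, exactly as in a single-server cluster.
curl -sfL https://get.k3s.io | K3S_URL=https://server1:6443 K3S_TOKEN=<token> sh -
```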

There are a few config flags that must be the same in all server nodes:

6 changes: 2 additions & 4 deletions docs/datastore/ha.md
@@ -3,8 +3,6 @@ title: High Availability External DB
weight: 30
---

> **Note:** Official support for installing Rancher on a Kubernetes cluster was introduced in our v1.0.0 release.
This section describes how to install a high-availability K3s cluster with an external database.

Single server clusters can meet a variety of use cases, but for environments where uptime of the Kubernetes control plane is critical, you can run K3s in an HA configuration. An HA K3s cluster comprises:
@@ -14,7 +12,7 @@ Single server clusters can meet a variety of use cases, but for environments whe
* An **external datastore** (as opposed to the embedded SQLite datastore used in single-server setups)
* A **fixed registration address** that is placed in front of the server nodes to allow agent nodes to register with the cluster
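
For illustration, the external datastore is selected with the `--datastore-endpoint` flag; the connection strings below are sketches with placeholder credentials and hostnames, so check the datastore documentation for the authoritative formats:

```bash
# MySQL
k3s server --datastore-endpoint="mysql://user:pass@tcp(db-host:3306)/k3s"

# PostgreSQL
k3s server --datastore-endpoint="postgres://user:pass@db-host:5432/k3s"

# External etcd
k3s server --datastore-endpoint="https://etcd-1:2379,https://etcd-2:2379,https://etcd-3:2379"
```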

For more details on how these components work together, refer to the [architecture section.](../architecture/architecture.md#high-availability-k3s-server-with-an-external-db)
For more details on how these components work together, refer to the [architecture section.](../architecture/architecture.md#high-availability-k3s)

Agents register through the fixed registration address, but after registration they establish a connection directly to one of the server nodes. This is a websocket connection initiated by the `k3s agent` process, and it is maintained by a client-side load balancer running as part of the agent process.

@@ -50,7 +48,7 @@ By default, server nodes will be schedulable and thus your workloads can get lau

Once you've launched the `k3s server` process on all server nodes, ensure that the cluster has come up properly with `k3s kubectl get nodes`. You should see your server nodes in the Ready state.

### 3. Configure the Fixed Registration Address
### 3. Optional: Configure a Fixed Registration Address

Agent nodes need a URL to register against. This can be the IP or hostname of any of the server nodes, but in many cases those may change over time. For example, if you are running your cluster in a cloud that supports scaling groups, you may scale the server node group up and down over time, causing nodes to be created and destroyed and thus having different IPs from the initial set of server nodes. Therefore, you should have a stable endpoint in front of the server nodes that will not change over time. This endpoint can be set up using any number of approaches, such as:

2 changes: 1 addition & 1 deletion docs/installation/network-options.md
@@ -178,7 +178,7 @@ and on agents:
--node-external-ip=<AGENT_EXTERNAL_IP>
```

where `SERVER_EXTERNAL_IP` is the IP through which we can reach the server node and `AGENT_EXTERNAL_IP` is the IP through which we can reach the agent/worker node. Note that the `K3S_URL` config parameter in the agent/worker should use the `SERVER_EXTERNAL_IP` to be able to connect to it. Remember to check the [Networking Requirements](../installation/requirements.md#networking) and allow access to the listed ports on both internal and external addresses.
where `SERVER_EXTERNAL_IP` is the IP through which we can reach the server node and `AGENT_EXTERNAL_IP` is the IP through which we can reach the agent node. Note that the `K3S_URL` config parameter in the agent should use the `SERVER_EXTERNAL_IP` to be able to connect to it. Remember to check the [Networking Requirements](../installation/requirements.md#networking) and allow access to the listed ports on both internal and external addresses.

Both `SERVER_EXTERNAL_IP` and `AGENT_EXTERNAL_IP` must be reachable from each other and are normally public IPs.
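
Putting it together (placeholder addresses from the documentation range), an agent installed via the install script would use the server's external IP in `K3S_URL` and advertise its own external IP:

```bash
# K3S_URL must use SERVER_EXTERNAL_IP so the agent can reach the server.
curl -sfL https://get.k3s.io | \
  K3S_URL=https://203.0.113.10:6443 K3S_TOKEN=<token> \
  sh -s - --node-external-ip=203.0.113.20
```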

