diff --git a/README.md b/README.md
index f530290..b6b101f 100644
--- a/README.md
+++ b/README.md
@@ -16,22 +16,30 @@ in a Kubernetes cluster. The source of the content could be another node in the

 #### Important Disclaimer

-This is **work in progress** and not yet production ready. We are actively working on this project and would love to
-hear your feedback. Please feel free to open an issue or a pull request.
+**Work in Progress**: We are actively working on this project and would love to hear your feedback.
+Please feel free to open an issue or a pull request.

 ## Usage

-Peerd is designed to be deployed as a daemonset on every node in a Kubernetes cluster and acts as a registry mirror. It
-discovers and serves content from other nodes in the cluster, and can also download content from an upstream source.
+**Peer D**aemon is designed to be deployed as a daemonset on every node in a Kubernetes cluster and acts as a registry
+mirror.
+
+* It discovers other nodes in the cluster and establishes a peer-to-peer overlay network in the cluster using the
+  [Kademlia DHT][white paper] protocol.
+
+* It discovers content such as OCI images in the node's containerd content store as well as streamable container files,
+  such as used in [Azure Artifact Streaming][ACR Artifact Streaming], and advertises them to its peers.
+
+* It can serve discovered/cached content to other nodes in the cluster, acting as a mirror for the content.

 This is useful in the following scenarios:

-1. **Increased Throughput**: For downloading large images or deploying large clusters, the registry can become a
-   bottleneck. Peerd can be used to download images from other nodes in the cluster that have already downloaded it,
+1. **Increased Throughput**: For downloading large images or deploying large clusters, the container/artifact registry
+   can become a bottleneck. Peerd can be used to download images from other nodes in the cluster that have already downloaded it,
    increasing throughput.

-2. **Improved Fault Tolerance**: If the registry is unavailable, Peerd can still serve images from other nodes in the
-   cluster.
+2. **Improved Fault Tolerance**: If the upstream registry is unavailable, Peerd can still serve images from other nodes
+   in the cluster.

 3. **Firewall configuration**: Peerd can be used to download images from other nodes in the cluster. This can be useful
    in scenarios where outbound internet access is restricted on some nodes.
diff --git a/assets/images/cluster.png b/assets/images/cluster.png
deleted file mode 100644
index 5dd7c67..0000000
Binary files a/assets/images/cluster.png and /dev/null differ
diff --git a/assets/mermaid/peerd-dht-topo.md b/assets/mermaid/peerd-dht-topo.md
new file mode 100644
index 0000000..edde4b9
--- /dev/null
+++ b/assets/mermaid/peerd-dht-topo.md
@@ -0,0 +1,54 @@
+graph TD;
+  subgraph Cluster[DHT Topology in a Kubernetes Cluster]
+    direction LR
+    subgraph peerd-1[Peerd]
+      dht-1(DHT)
+    end
+
+    subgraph peerd-2[Peerd]
+      dht-2(DHT)
+    end
+
+    subgraph peerd-3[Peerd]
+      dht-3(DHT)
+    end
+
+    subgraph Node1[Node A]
+      peerd-1
+    end
+
+    subgraph Node2[Node B]
+      peerd-2(Peerd)
+    end
+
+    subgraph Node3[Node C]
+      peerd-3(Peerd)
+    end
+
+    subgraph k8s-api[K8s API Server]
+      lease-1((("Peerd Leader
+      Lease Resource")))
+    end
+  end
+
+  dht-1 o-.-o |Initialize<br><br>| lease-1
+  dht-2 o-.-o |Initialize<br><br>| lease-1
+  dht-3 o-.-o |Initialize<br><br>| lease-1
+
+  dht-1 <==> |State<br><br>| dht-2
+  dht-1 <==> |State<br><br>| dht-3
+  dht-2 <==> |State<br><br>| dht-3
+
+  classDef cluster fill:#fafafa,stroke:#bbb,stroke-width:2px,color:#326ce5;
+  class Node1,Node2,Node3 cluster
+
+  classDef outer fill:#e0f7fa,stroke:#00008b,stroke-width:2px,color:#a9a9a9;
+  class Cluster outer
+
+  subgraph Legend[Legend]
+    direction TB
+    tls[Initialize - TLS connections]
+    mtls[State - mTLS connections]
+  end
+
+  Cluster ~~~ Legend
diff --git a/assets/mermaid/peerd-pull-seq.md b/assets/mermaid/peerd-pull-seq.md
new file mode 100644
index 0000000..01cfad6
--- /dev/null
+++ b/assets/mermaid/peerd-pull-seq.md
@@ -0,0 +1,40 @@
+sequenceDiagram
+  Title: Peer-to-Peer Image Pulling in a Kubernetes Cluster
+
+  box white Node A
+    participant Nginx Pod
+    participant Containerd Client
+    participant Peerd-A
+  end
+
+  box white Node N
+    participant Peerd-N
+  end
+
+  box white Upstream Registry
+    participant Upstream
+  end
+
+  loop Every layer
+    Containerd Client->>Peerd-A: GET sha256:l1
+    Note over Containerd Client,Peerd-A: 1
+
+    alt peer found
+      Peerd-A->>Peerd-N: GET sha256:l1
+      Note over Peerd-A,Peerd-N: 2
+      activate Peerd-N
+      Peerd-N->>Peerd-A: result
+      Peerd-A->>Containerd Client: result
+    else upstream request
+      Containerd Client->>Upstream: GET sha256:l1
+      Note over Peerd-A,Upstream: 3
+      Upstream->>Containerd Client: result
+    end
+
+    opt Advertise state (async)
+      activate Peerd-A
+      Note right of Peerd-A: Advertise state from containerd content store
+    end
+  end
+
+  Containerd Client-->Nginx Pod: start
diff --git a/assets/mermaid/peerd-pull.md b/assets/mermaid/peerd-pull.md
new file mode 100644
index 0000000..664b2a8
--- /dev/null
+++ b/assets/mermaid/peerd-pull.md
@@ -0,0 +1,67 @@
+graph TB;
+  subgraph Cluster[Peer-to-Peer Image Pulling in a Kubernetes Cluster]
+    direction LR
+    subgraph app-1[mcr.microsoft.com/nginx:latest]
+      app-pod-1((Pod))
+    end
+
+    subgraph ctr-1[Containerd]
+      client-1{Client}
+      store-1[[Content Store]]
+
+      subgraph hosts-1["Containerd Hosts Configuration"]
+        h-1[(mcr.microsoft.com pull mirror: peerd)]
+      end
+    end
+
+    subgraph peerd-1[Peerd]
+      proxy-1(Proxy)
+      sub-1(((Subscription)))
+    end
+
+    subgraph Node1[Node A]
+      hosts-1
+      app-1
+      peerd-1
+      ctr-1
+    end
+
+    subgraph NodeN[Node N]
+      peerd-n(Peerd)
+    end
+
+  end
+
+  subgraph manifest-1[mcr.microsoft.com/nginx@sha256:m1]
+    direction TB
+    c-1[config sha256:c1]
+    l-1[layer sha256:l1]
+    l-2[layer sha256:l2]
+  end
+
+  subgraph Upstream[Upstream Container Registry]
+    acr(mcr.microsoft.com)
+  end
+
+  hosts-1 ~~~ client-1
+  c-1 ~~~ l-1
+  l-1 ~~~ l-2
+
+  client-1 --> |1| proxy-1
+  proxy-1 -.-> |    2| peerd-n
+  client-1 -.-> |    3| acr
+  client-1 --o |    4| app-1
+
+  sub-1 o-.-o store-1
+  sub-1 o-.-o |Advertise| peerd-n
+
+  classDef containerd fill:#e0ffff,stroke:#000,stroke-width:4px,color:#000;
+
+  classDef cluster fill:#fafafa,stroke:#bbb,stroke-width:2px,color:#326ce5;
+  class Node1,NodeN cluster
+
+  classDef registry fill:#e0f7fa,stroke:#00008b,stroke-width:2px,color:#326ce5;
+  class acr registry
+
+  classDef outer fill:#e0f7fa,stroke:#00008b,stroke-width:2px,color:#a9a9a9;
+  class Cluster outer
diff --git a/assets/mermaid/peerd-streaming-seq.md b/assets/mermaid/peerd-streaming-seq.md
new file mode 100644
index 0000000..20a8b0e
--- /dev/null
+++ b/assets/mermaid/peerd-streaming-seq.md
@@ -0,0 +1,55 @@
+sequenceDiagram
+  Title: Peer-to-Peer Artifact Streaming in a Kubernetes Cluster
+
+  box white Node A
+    participant Nginx Pod
+    participant File System
+    participant Overlaybd TCMU
+    participant Peerd-A
+  end
+
+  box white Node N
+    participant Peerd-N
+  end
+
+  box white Upstream Registry
+    participant Upstream
+  end
+
+  Nginx Pod->>File System: Read bytes 30-2000 from data.csv
+  Note over Nginx Pod,File System: 1
+
+  File System->>Overlaybd TCMU: Read bytes 30-2000 from data.csv
+  Note over File System,Overlaybd TCMU: 2
+
+  Overlaybd TCMU->>Peerd-A: Fetch file data.csv 'Range: bytes=30-2000'
+  Note over Overlaybd TCMU,Peerd-A: 3
+  activate Peerd-A
+
+  alt bytes cached
+    Peerd-A->>Overlaybd TCMU: result
+  else peer found
+    Peerd-A->>Peerd-N: Fetch file data.csv 'Range: bytes=30-2000'
+    Note over Peerd-A,Peerd-N: 4
+    activate Peerd-N
+    Peerd-N->>Peerd-A: result
+    Peerd-A->>Overlaybd TCMU: result
+  else upstream request
+    Peerd-A->>Upstream: Fetch file data.csv 'Range: bytes=30-2000'
+    Note over Peerd-A,Upstream: 5
+    Upstream->>Peerd-A: result
+    Peerd-A->>Overlaybd TCMU: result
+  end
+
+  opt Optimistic File Prefetch
+    activate Peerd-A
+    Note right of Peerd-A: Prefetch entire file from peers/upstream
+  end
+
+  opt Advertise state (async)
+    activate Peerd-A
+    Note right of Peerd-A: Advertise state from files cache
+  end
+
+  Overlaybd TCMU->>File System: result
+  File System->>Nginx Pod: result
diff --git a/assets/mermaid/peerd-streaming.md b/assets/mermaid/peerd-streaming.md
new file mode 100644
index 0000000..6524863
--- /dev/null
+++ b/assets/mermaid/peerd-streaming.md
@@ -0,0 +1,58 @@
+graph RL;
+  subgraph Cluster[Peer-to-Peer Artifact Streaming in a Kubernetes Cluster]
+    direction LR
+    subgraph kernel-1[Kernel]
+      fs-1[Filesystem]
+    end
+
+    subgraph app-1[Nginx]
+      app-pod-1((Pod))
+    end
+
+    subgraph overlaybd-1["(User Space)"]
+      driver-1["Overlaybd
+      TCMU"]
+    end
+
+    subgraph peerd-1[Peerd]
+      proxy-1(Proxy)
+      files-1(("Files
+      Cache"))
+    end
+
+    subgraph Node1[Node A]
+      kernel-1
+      app-1
+      overlaybd-1
+      peerd-1
+    end
+
+    subgraph NodeN[Node N]
+      peerd-n(Peerd)
+    end
+
+    files-1 o-.-o |<br>Advertise<br>| peerd-n
+
+    app-pod-1 --> |<br>1<br>| fs-1
+    fs-1 -.-> |<br>2<br>| driver-1
+    driver-1 --> |<br>3<br>| proxy-1
+    proxy-1 <-.-> |<br>4<br>| peerd-n
+  end
+
+  subgraph Upstream[Upstream Container Registry]
+    acr(mcr.microsoft.com)
+  end
+
+  proxy-1 -.-> |<br>5<br>| acr
+
+  classDef userspace fill:#e0ffff,stroke:#000,stroke-width:4px,color:#000;
+  class proxy-1,files-1,driver-1,app-pod-1,peerd-n userspace
+
+  classDef cluster fill:#fafafa,stroke:#bbb,stroke-width:2px,color:#326ce5;
+  class Node1,NodeN cluster
+
+  classDef registry fill:#e0f7fa,stroke:#00008b,stroke-width:2px,color:#326ce5;
+  class acr registry
+
+  classDef outer fill:#e0f7fa,stroke:#00008b,stroke-width:2px,color:#a9a9a9;
+  class Cluster outer
diff --git a/assets/mermaid/rendered/peerd-dht-topo.png b/assets/mermaid/rendered/peerd-dht-topo.png
new file mode 100644
index 0000000..9e974e8
Binary files /dev/null and b/assets/mermaid/rendered/peerd-dht-topo.png differ
diff --git a/assets/mermaid/rendered/peerd-pull-seq.png b/assets/mermaid/rendered/peerd-pull-seq.png
new file mode 100644
index 0000000..3bfd955
Binary files /dev/null and b/assets/mermaid/rendered/peerd-pull-seq.png differ
diff --git a/assets/mermaid/rendered/peerd-pull.png b/assets/mermaid/rendered/peerd-pull.png
new file mode 100644
index 0000000..cb932db
Binary files /dev/null and b/assets/mermaid/rendered/peerd-pull.png differ
diff --git a/assets/mermaid/rendered/peerd-streaming-seq.png b/assets/mermaid/rendered/peerd-streaming-seq.png
new file mode 100644
index 0000000..041d8bd
Binary files /dev/null and b/assets/mermaid/rendered/peerd-streaming-seq.png differ
diff --git a/assets/mermaid/rendered/peerd-streaming.png b/assets/mermaid/rendered/peerd-streaming.png
new file mode 100644
index 0000000..e55ed04
Binary files /dev/null and b/assets/mermaid/rendered/peerd-streaming.png differ
diff --git a/docs/design.md b/docs/design.md
index 6513e47..a81fab7 100644
--- a/docs/design.md
+++ b/docs/design.md
@@ -1,10 +1,26 @@
 # Peerd Design

-![cluster-arch]
-
 The design is inspired from the [Spegel] project, which is a peer to peer proxy for container images that uses libp2p.
 In this section, we describe the design and architecture of `peerd`.

+#### **DHT Topology**
+
+|                   |
+| ----------------- |
+| ![peerd-dht-topo] |
+
+#### **Image Pulls Description**
+
+|               |                   |
+| ------------- | ----------------- |
+| ![peerd-pull] | ![peerd-pull-seq] |
+
+#### **Image Streaming Description**
+
+|                    |                        |
+| ------------------ | ---------------------- |
+| ![peerd-streaming] | ![peerd-streaming-seq] |
+
 ### Background

 An OCI image is composed of multiple layers, where each layer is stored as a blob in the registry. When a container
@@ -188,6 +204,10 @@ running this container in p2p vs non-p2p mode on a 3 node AKS cluster with Artif

 ---

-[cluster-arch]: ../assets/images/cluster.png
 [file-system-layout]: ../assets/images/file-system-layout.png
 [Spegel]: https://github.com/XenitAB/spegel
+[peerd-pull]: ../assets/mermaid/rendered/peerd-pull.png
+[peerd-pull-seq]: ../assets/mermaid/rendered/peerd-pull-seq.png
+[peerd-streaming]: ../assets/mermaid/rendered/peerd-streaming.png
+[peerd-streaming-seq]: ../assets/mermaid/rendered/peerd-streaming-seq.png
+[peerd-dht-topo]: ../assets/mermaid/rendered/peerd-dht-topo.png
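
Reviewer note: the lookup order drawn in `peerd-streaming-seq.md` (cached bytes first, then a peer, then the upstream registry) may be easier to check against a small model. The Python sketch below is only an illustration of that fallback order. `RangeFetcher`, `fetch_range` and the `Source` callable type are hypothetical names introduced here, not part of peerd, and the sketch does not model the DHT, HTTP transport, prefetching or advertisement.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical source type: takes (file name, first byte, last byte) and
# returns the bytes, or None when the source cannot serve that range.
Source = Callable[[str, int, int], Optional[bytes]]


class RangeFetcher:
    """Illustrative model of the cache -> peer -> upstream order in the diagram."""

    def __init__(self, peers: List[Source], upstream: Source) -> None:
        # Local cache keyed by (file name, first byte, last byte).
        self.cache: Dict[Tuple[str, int, int], bytes] = {}
        self.peers = peers
        self.upstream = upstream

    def fetch_range(self, name: str, start: int, end: int) -> Tuple[bytes, str]:
        """Return the requested bytes and a label for where they came from."""
        key = (name, start, end)

        # "alt bytes cached": answer from the local cache.
        if key in self.cache:
            return self.cache[key], "cache"

        # "else peer found": ask each peer in turn and keep the first hit.
        for peer in self.peers:
            data = peer(name, start, end)
            if data is not None:
                self.cache[key] = data
                return data, "peer"

        # "else upstream request": fall back to the upstream registry.
        data = self.upstream(name, start, end)
        if data is None:
            raise LookupError(f"{name} bytes {start}-{end} not available")
        self.cache[key] = data
        return data, "upstream"
```

The range is treated as inclusive on both ends, matching the `Range: bytes=30-2000` header in the diagram, so a source backed by an in-memory blob would return `blob[start:end + 1]`.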