Multi node support #9

Open
wants to merge 17 commits into
base: master
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1 +1,2 @@
*.retry
__pycache__
2 changes: 1 addition & 1 deletion .travis.yml
Original file line number Diff line number Diff line change
Expand Up @@ -10,4 +10,4 @@ install:
- pip install testinfra molecule docker

script:
- molecule --debug test
- travis_wait 30 molecule test
138 changes: 132 additions & 6 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,7 +16,7 @@
```yaml
- src: https://github.com/5monkeys/ansible-docker-role
  name: docker
```
```

* Update `ansible.cfg` to search for roles relative to playbook:

Expand Down Expand Up @@ -46,12 +46,12 @@ docker_storage_driver: "overlay2"
docker_python_version: "4.0.1"

# If TLS should be enabled on the docker daemon and SSL-certificates generated
docker_use_tls: true
docker_use_tls: "{{ 'docker_swarm_managers' in group_names }}"
# What to set as Organization in SSL-certificates
docker_tls_organization: "Acme"
# Where to place certificates on host
docker_tls_path: "/etc/docker/certs"
# When the client certificate should expire.
# When the client certificate should expire.
docker_tls_client_expires_after: "+52w"
# The client certificate common name
docker_tls_client_common_name: "client"
Expand All @@ -60,18 +60,144 @@ docker_tls_client_common_name: "client"
docker_enable_swarm: true
# What version of the python openssl library to use
docker_py_openssl_version: "19.0.0"

# These are only relevant when 'docker_enable_swarm' is true
docker_swarm_interface: "{{ ansible_default_ipv4['interface'] }}"
docker_swarm_addr: "{{ hostvars[inventory_hostname]['ansible_' + docker_swarm_interface]['ipv4']['address'] }}"
docker_swarm_port: 2377

# No node labels set per default
docker_swarm_labels: {}
```
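
The two derived defaults above (`docker_swarm_addr` and `docker_use_tls`) can be read as plain lookups. The following is a minimal Python sketch of that logic, not the role's implementation: the function names `resolve_swarm_addr` and `use_tls` and the sample `facts` dict are hypothetical, and the addresses are documentation-range placeholders.

```python
def resolve_swarm_addr(facts, interface=None):
    """Mirror the default `docker_swarm_addr` lookup on a plain facts dict."""
    # docker_swarm_interface defaults to the interface of the default IPv4 route.
    interface = interface or facts["ansible_default_ipv4"]["interface"]
    # docker_swarm_addr reads the IPv4 address from the 'ansible_<interface>' fact.
    return facts["ansible_" + interface]["ipv4"]["address"]


def use_tls(group_names):
    """Mirror the default `docker_use_tls` expression: managers only."""
    return "docker_swarm_managers" in group_names


# Hypothetical facts for a single host (assumed values, for illustration only).
facts = {
    "ansible_default_ipv4": {"interface": "eth0"},
    "ansible_eth0": {"ipv4": {"address": "192.0.2.10"}},
    "ansible_eth1": {"ipv4": {"address": "198.51.100.10"}},
}

print(resolve_swarm_addr(facts))                    # default route interface
print(resolve_swarm_addr(facts, interface="eth1"))  # explicit override
print(use_tls(["docker_swarm_workers", "nodes"]))   # worker-only host
```

Overriding `docker_swarm_interface` for a host therefore changes which address the swarm advertises, without having to set `docker_swarm_addr` directly.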

## Example playbook(s)

The first host in the `docker_swarm_managers` group will be initiated as the master node.

Any host declared in both groups will be configured as both a manager and a worker
(or as master and worker, if it is also the first manager).

To make a node both a worker and a manager, it has to be explicitly declared in
both the `docker_swarm_managers` and `docker_swarm_workers` groups. This is _unlike
the default behaviour of Docker_, where a joining manager node will perform tasks
unless an `--availability=[drain|pause]` argument is given.
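
The group rules above can be summarised in a short Python sketch. This is an illustration under stated assumptions, not the role's code: `node_role` is a hypothetical helper, the `drain` availability for manager-only nodes is what this PR's tests expect, and `active` for plain workers is assumed to be Docker's default.

```python
def node_role(host, managers, workers):
    """Return (role, availability) for a host based on its inventory groups."""
    is_manager = host in managers
    is_worker = host in workers
    if not (is_manager or is_worker):
        raise ValueError("%s is in neither swarm group" % host)
    if is_manager:
        # A manager only accepts tasks when it is also listed as a worker.
        return ("manager", "active" if is_worker else "drain")
    # Assumption: worker-only nodes keep Docker's default availability.
    return ("worker", "active")


managers = ["manager1", "manager2", "manager3"]
workers = ["worker1", "worker2", "manager3"]

for host in ["manager1", "manager3", "worker1"]:
    print(host, node_role(host, managers, workers))
```

Removing a host from `workers` while keeping it in `managers` flips its availability from `active` to `drain` in this model.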

### Single node setup

```ini
# hosts file
[docker_swarm_managers]
host1

[docker_swarm_workers]
host1

[nodes]
host1
```

## Example playbook
```yaml
# playbook.yml
- name: Setup docker
  hosts: managers
  hosts: nodes
  become: true
  become_user: root
  roles:
    - docker
  vars:
    docker_home: "{{ inventory_dir }}/.certs/"
    docker_tls_organization: "my_org"
    docker_ce_version: "18.06"
    docker_ce_version: "5:19.03"
```

### Multi node setup

A multi node setup only accepts a `docker_swarm_managers` group with an **odd**
host count. This is in line with Docker's recommendation ([which you can read more
about here](https://docs.docker.com/engine/swarm/admin_guide/)).
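
The reasoning behind the odd count can be sketched in Python. With N managers, a majority (quorum) of `N // 2 + 1` must stay reachable, so an even count adds a node without tolerating any extra failure. The helper names below are hypothetical, and how the role itself rejects an even count is not shown here.

```python
def fault_tolerance(manager_count):
    """Number of managers that can be lost while keeping a majority."""
    if manager_count < 1:
        raise ValueError("at least one manager is required")
    quorum = manager_count // 2 + 1
    return manager_count - quorum


def check_manager_count(managers):
    """Reject an even number of managers, as the README requires."""
    if len(managers) % 2 == 0:
        raise ValueError(
            "docker_swarm_managers must have an odd host count, got %d"
            % len(managers)
        )


for count in (1, 3, 4, 5):
    print(count, "managers tolerate", fault_tolerance(count), "failure(s)")

check_manager_count(["manager1", "manager2", "manager3"])  # passes silently
```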

```ini
# hosts file
[docker_swarm_managers]
manager1 # <-- Will be initiated as master node
manager2
manager3

[docker_swarm_workers]
worker1
worker2
manager3 # <-- A manager node accepting tasks

[nodes:children]
docker_swarm_managers
docker_swarm_workers
```

```yaml
# playbook.yml
- name: Setup docker swarm
  hosts: nodes
  become: true
  become_user: root
  roles:
    - docker
  vars:
    docker_home: "{{ inventory_dir }}/.certs/"
    docker_tls_organization: "my_org"
    docker_ce_version: "5:19.03"
    docker_enable_swarm: true
```

## Adding labels to swarm nodes

The playbook looks for a declared variable named `docker_swarm_labels` in order
to set swarm labels on a node.

`docker_swarm_labels` is expected to be defined as a dict.

For a given host, the value of the `docker_swarm_labels` variable will replace
_all_ of the node's current labels. As a result, if a node previously had any
labels, running your playbook again with an undefined or empty
`docker_swarm_labels` variable removes _all_ labels from that node.
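
The replace-not-merge behaviour described above can be modelled in a few lines of Python. This is only a sketch of the semantics; `apply_labels` is a hypothetical name and says nothing about how the role talks to Docker.

```python
def apply_labels(current_labels, docker_swarm_labels=None):
    """Return a node's labels after a playbook run.

    The declared dict replaces the current labels entirely; an undefined
    or empty value therefore clears every label on the node.
    """
    return dict(docker_swarm_labels or {})


current = {"a": "worker", "zone": "eu"}

print(apply_labels(current, {"one": "manager"}))  # old labels are dropped
print(apply_labels(current))                      # undefined: all labels removed
```

To keep an existing label, it has to be repeated in `docker_swarm_labels` on every run.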

```yml
# playbook.yml
- name: Setup docker swarm
  hosts: nodes
  become: true
  become_user: root
  roles:
    - docker
  vars:
    docker_swarm_labels:
      nodes: gets_this_label
```

## Converting "manager and worker" node to "manager only" node

Converting an already deployed "manager and worker" node to a "manager only" node
is done by removing the node from the `docker_swarm_workers` group.

Consider an initial deploy with a hosts file like:

```ini
# hosts file
[docker_swarm_managers]
host1

[docker_swarm_workers]
host1
worker1
```

Changing the hosts file to the following and then running your playbook again
sets `host1` as "manager only":

```ini
# hosts file
[docker_swarm_managers]
host1

[docker_swarm_workers]
worker1
```
7 changes: 6 additions & 1 deletion defaults/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -7,11 +7,16 @@ docker_hosts:
docker_storage_driver: "overlay2"
docker_python_version: "4.0.1"

docker_use_tls: true
docker_use_tls: "{{ 'docker_swarm_managers' in group_names }}"
docker_tls_organization: "Acme"
docker_tls_path: "/etc/docker/certs"
docker_tls_client_expires_after: "+52w"
docker_tls_client_common_name: "client"

docker_enable_swarm: true
docker_py_openssl_version: "19.0.0"

# These are only relevant when 'docker_enable_swarm' is true
docker_swarm_interface: "{{ ansible_default_ipv4['interface'] }}"
docker_swarm_addr: "{{ hostvars[inventory_hostname]['ansible_' + docker_swarm_interface]['ipv4']['address'] }}"
docker_swarm_port: 2377
35 changes: 35 additions & 0 deletions molecule/default/molecule.yml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,8 @@ platforms:
      - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
    command: /sbin/init
    privileged: true
    groups:
      - docker_swarm_managers
  - name: ubuntu18
    image: ubuntu:18.04
    cap_add:
Expand All @@ -22,11 +24,44 @@ platforms:
      - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
    command: /sbin/init
    privileged: true
    groups:
      - docker_swarm_workers
  - name: extra_manager
    image: ubuntu:18.04
    cap_add:
      - SYS_ADMIN
    volume_mounts:
      - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
    command: /sbin/init
    privileged: true
    groups:
      - docker_swarm_managers
      - docker_swarm_workers
  - name: extra_manager2
    image: ubuntu:18.04
    cap_add:
      - SYS_ADMIN
    volume_mounts:
      - "/sys/fs/cgroup:/sys/fs/cgroup:ro"
    command: /sbin/init
    privileged: true
    groups:
      - docker_swarm_managers

provisioner:
  name: ansible
  lint:
    name: ansible-lint
  inventory:
    host_vars:
      ubuntu16:
        docker_swarm_labels:
          one: manager
          second: label
    group_vars:
      docker_swarm_workers:
        docker_swarm_labels:
          a: worker
scenario:
  name: default
verifier:
Expand Down
133 changes: 131 additions & 2 deletions molecule/default/tests/test_default.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,12 @@
import json
import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
runner = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ["MOLECULE_INVENTORY_FILE"]
).get_hosts("all")
)
testinfra_hosts = runner.get_hosts("all")


def test_docker_running_and_enabled(host):
Expand All @@ -15,3 +17,130 @@ def test_docker_running_and_enabled(host):

def test_able_to_access_docker_without_root(host):
    assert "docker" in host.user("ubuntu").groups


def test_docker_swarm_enabled(host):
    swarm_state = json.loads(
        host.check_output(
            "docker info --format '{{json .Swarm.LocalNodeState}}'"
        )
    )
    assert swarm_state == "active"


def test_docker_swarm_status(host):
    swarm_info = json.loads(
        host.check_output("docker info --format '{{json .Swarm}}'")
    )
    hostname = host.check_output("hostname -s")

    if hostname in runner.get_hosts("docker_swarm_managers"):
        msg = "Expected '%s' to be a manager" % hostname
        assert swarm_info["ControlAvailable"], msg
        assert swarm_info["Managers"] == 3
        assert swarm_info["Nodes"] == 4
    elif hostname in runner.get_hosts("docker_swarm_workers"):
        msg = "Expected '%s' to be a worker" % hostname
        assert not swarm_info["ControlAvailable"], msg
    else:
        assert False, "Unexpected hostname in swarm setup: %s" % hostname


def test_docker_manager_node_availability(host):
    hostname = host.check_output("hostname -s")

    def get_node_info():
        cmd = "docker node inspect self --format '{{json .Spec}}'"
        # Raises an AssertionError based on the return code of
        # given command
        return json.loads(host.check_output(cmd))

    if hostname == "ubuntu18":
        # Worker only node
        try:
            get_node_info()
        except AssertionError:
            assert hostname in runner.get_hosts("docker_swarm_workers")
            assert hostname not in runner.get_hosts("docker_swarm_managers")

    elif hostname in {"ubuntu16", "extra_manager2"}:
        # Manager only node
        node_info = get_node_info()
        assert node_info["Role"] == "manager"
        assert node_info["Availability"] == "drain"

    elif hostname == "extra_manager":
        # Manager and worker node
        node_info = get_node_info()
        assert node_info["Role"] == "manager"
        assert node_info["Availability"] == "active"

    else:
        assert False, "Unexpected hostname in swarm setup: %s" % hostname


def test_docker_swarm_labels(host):
    def get_labels(hostname):
        cmd = "docker node inspect %s --format '{{json .Spec.Labels}}'"
        return json.loads(host.check_output(cmd % hostname))

    hostname = host.check_output("hostname -s")
    if hostname in runner.get_hosts("docker_swarm_managers"):
        assert get_labels("ubuntu16") == {"one": "manager", "second": "label"}
        assert get_labels("ubuntu18") == {"a": "worker"}
        assert get_labels("extra_manager") == {"a": "worker"}
        assert get_labels("extra_manager2") == {}


def test_docker_daemon_json(host):
    def parse_daemon_json():
        filepath = "/etc/docker/daemon.json"
        return json.loads(host.file(filepath).content_string)

    hostname = host.check_output("hostname -s")
    if hostname in runner.get_hosts("docker_swarm_managers"):
        daemon_json = parse_daemon_json()
        assert "hosts" in daemon_json
        assert "storage-driver" in daemon_json
        assert "tlsverify" in daemon_json and daemon_json["tlsverify"]
        assert "tlscacert" in daemon_json
        assert "tlscert" in daemon_json
        assert "tlskey" in daemon_json

    elif hostname in runner.get_hosts("docker_swarm_workers"):
        daemon_json = parse_daemon_json()
        assert "hosts" in daemon_json
        assert "storage-driver" in daemon_json
        assert "tlsverify" not in daemon_json
        assert "tlscacert" not in daemon_json
        assert "tlscert" not in daemon_json
        assert "tlskey" not in daemon_json

    else:
        assert False, "Unexpected hostname in swarm setup: %s" % hostname


def test_docker_swarm_certificates(host):
    hostname = host.check_output("hostname -s")
    if hostname in runner.get_hosts("docker_swarm_managers"):
        certdir = host.file("/etc/docker/certs/")
        assert certdir.exists
        assert certdir.is_directory
        certs = {
            "ca-key.pem",
            "ca.csr",
            "ca.pem",
            "server-key.pem",
            "server.csr",
            "server-cert.pem",
            "key.pem",
            "cert.pem",
        }
        for cert in certs:
            certfile = host.file("/etc/docker/certs/{}".format(cert))
            assert certfile.exists
            assert certfile.is_file
    elif hostname in runner.get_hosts("docker_swarm_workers"):
        assert not host.file("/etc/docker/certs/").exists
    else:
        assert False, "Unexpected hostname in swarm setup: %s" % hostname