A small set of convenient playbooks to assist in preparing K8s worker nodes to host an AIStore deployment. None of these are required. We use all of them in our reference environment, but you're free to make the filesystem as you wish, tune nodes as you wish, etc., in which case either ignore these or use them as a reference.
These playbooks have been tested only on Ubuntu hosts.
Each playbook is documented separately. See the links in the first column below.
Playbook(s) | Useful when |
---|---|
ais_enable_multiqueue | Enabling MQ IO schedulers in Ubuntu releases for which MQ is not the default |
ais_host_config_common | Tuning worker nodes; adding useful packages, etc. |
ais_datafs_mkfs | Creating or recreating filesystems for AIStore |
ais_host_post_kubespray | Using AIStore chart values that require "unsafe" sysctls; changes kubelet.env to enable them |
ais_gpuhost_config | Configuring GPU compute nodes in the same cluster: installs NVIDIA Docker 2, the NVIDIA container runtime, etc. |
ais_cluster_management | A collection of playbooks to deploy and upgrade AIS clusters on K8s. Cluster shutdown and the associated cleanup are also supported. |
The ais_host_config_common playbook includes a tagging scheme to allow more granular selection of tasks and to skip tasks that are likely site-specific.
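Tag-based selection relies on the standard `ansible-playbook` options `--list-tags`, `--tags`, and `--skip-tags`. The following is a minimal sketch that only prints the command lines rather than running them. It assumes, without confirmation from this README, that the playbook file is named `ais_host_config_common.yml` and that the inventory is `hosts.ini`. The tag names are placeholders, not tags defined by the playbook; use `--list-tags` to see the real ones.

```shell
#!/usr/bin/env bash
# Sketch: build ansible-playbook command lines for tag-based task selection.
# Assumptions (not from this README): the playbook file name below and the
# example tag names are hypothetical placeholders.
set -eu

PLAYBOOK="${PLAYBOOK:-ais_host_config_common.yml}"
INVENTORY="${INVENTORY:-hosts.ini}"

# Command that lists the tags a playbook defines, without running any tasks.
list_tags_cmd() {
  echo "ansible-playbook -i ${INVENTORY} ${PLAYBOOK} --list-tags"
}

# Command that runs only the tasks carrying the given comma-separated tags.
only_tags_cmd() {
  echo "ansible-playbook -i ${INVENTORY} ${PLAYBOOK} --tags ${1}"
}

# Command that runs everything except tasks carrying the given tags.
skip_tags_cmd() {
  echo "ansible-playbook -i ${INVENTORY} ${PLAYBOOK} --skip-tags ${1}"
}

list_tags_cmd
only_tags_cmd "example_tag_a,example_tag_b"
skip_tags_cmd "example_site_specific_tag"
```

Printing the commands first makes it easy to review which tasks would be selected before anything changes on the hosts.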
The vars directory includes variable definitions that control the playbooks, split into multiple files with comments explaining which playbooks they control (and which tags will use them).
The hosts-example.ini and ansible-example.cfg files are reference examples for constructing the actual hosts.ini and ansible.cfg files in the same path.
We run the playbooks in the following order, relative to the other setup steps:
- Hosts are installed with Ubuntu 18.04 LTS; ssh, Ansible, etc. are bootstrapped.
- To enable the MQ IO scheduler, we run playbook ais_enable_multiqueue and reboot.
- Next we run ais_host_config_common on all nodes (CPU and any GPU nodes).
- Kubespray time: establish the K8s cluster.
- If we're installing GPU worker nodes, we run playbook ais_gpuhost_config.
- Since we use the sysctl somaxconn in containers, we need to change kubelet.env; playbook ais_host_post_kubespray does this for us.
- Make filesystems with ais_datafs_mkfs.
- Deploy the AIS K8s operator with ais_deploy_operator.
- Deploy the AIS cluster with ais_deploy_cluster.
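
The ordering above can be sketched as a dry-run script that prints each step in sequence without executing anything. This is a sketch under assumptions this README does not state: that each playbook lives in a file named `<playbook>.yml` in the current directory (the deploy playbooks may in fact sit under ais_cluster_management), and that the inventory is `hosts.ini`. The names `INVENTORY`, `WITH_GPU`, `playbook_cmd`, and `print_run_order` are illustrative only.

```shell
#!/usr/bin/env bash
# Dry-run sketch: prints the steps in order; it does not execute anything.
# Assumptions (not from this README): playbook files are named <name>.yml
# and the inventory file is hosts.ini in the current directory.
set -eu

INVENTORY="${INVENTORY:-hosts.ini}"
WITH_GPU="${WITH_GPU:-no}"   # set to "yes" if the cluster has GPU worker nodes

# Print the ansible-playbook command line for one playbook.
playbook_cmd() {
  echo "ansible-playbook -i ${INVENTORY} ${1}.yml"
}

# Print every step in the order described above; non-playbook steps
# appear as comment lines in the output.
print_run_order() {
  playbook_cmd ais_enable_multiqueue
  echo "# reboot the hosts"
  playbook_cmd ais_host_config_common
  echo "# run Kubespray to establish the K8s cluster"
  if [ "${WITH_GPU}" = "yes" ]; then
    playbook_cmd ais_gpuhost_config
  fi
  playbook_cmd ais_host_post_kubespray
  playbook_cmd ais_datafs_mkfs
  playbook_cmd ais_deploy_operator
  playbook_cmd ais_deploy_cluster
}

print_run_order
```

Running it with `WITH_GPU=yes` adds the ais_gpuhost_config step after the Kubespray step.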