bug fix for kubescaler and setup for flux-operator-hpa-ca (#11)
* bug fix for kubescaler and setup for flux-operator-hpa-ca
* added a switch to choose between eks nodegroup and cloudformation
* added watch events for listing nodes
* implemented eksctl cluster for lammps with ca
* added metrics for scalability experiments and flux/lammps experiments
* experimental setup complete for lammps full and semi auto
* linting and version bump
rajibhossen authored Aug 23, 2023
1 parent c1008e6 commit 9ea5e9c
Showing 42 changed files with 7,748 additions and 35 deletions.
2 changes: 1 addition & 1 deletion .github/dev-requirements.txt
@@ -1,5 +1,5 @@
pre-commit
black
black==23.3.0
isort
flake8
pytest
2 changes: 2 additions & 0 deletions .gitignore
@@ -15,3 +15,5 @@ __pycache__
*auth-config.yaml
*kubeconfig.yaml
*kubeconfig-*.yaml
**/.DS_Store
.vscode
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -14,6 +14,7 @@ and **Merged pull requests**. Critical items to know are:
The versions coincide with releases on pip. Only major versions will be released as tags on Github.

## [0.0.x](https://github.com/converged-computing/kubescaler/tree/main) (0.0.x)
- extensive changes to aws client (thanks to @rajibhossen!) (0.0.15)
- use api client with consistent token to associate nodes to cluster (0.0.14)
- remove dependency on subprocess and kubectl (0.0.13)
- default install should include all cloud deps (0.0.12)
46 changes: 46 additions & 0 deletions examples/aws/README.md
@@ -28,6 +28,7 @@ the vpc. We do small max sizes here since it's just a demo! This first example r
$ pip install -e .[aws]
$ pip install -e kubescaler[aws]
```

```bash
# Test scale up in increments of 1 (up to 3) for c2-standard-8 (the default) just one iteration!
$ python test-scale.py --increment 1 small-cluster --max-node-count 3 --min-node-count 0 --start-iter 0 --end-iter 1
@@ -38,3 +39,48 @@ $ python test-scale.py --increment 1 test-cluster --max-node-count 32 --min-node
# Test scale down in increments of 2 (5 down to 1) for 10 iterations (default)
$ python test-scale.py --increment 2 test-cluster --down --max-node-count 5
```

Arguments
```console
usage: test-scale.py [-h] [--outdir OUTDIR] [--experiment EXPERIMENT] [--start-iter START_ITER] [--end-iter ITERS] [--max-node-count MAX_NODE_COUNT] [--min-node-count MIN_NODE_COUNT] [--start-node-count START_NODE_COUNT] [--machine-type MACHINE_TYPE] [--eks-nodegroup] [--increment INCREMENT] [--down] [cluster_name]

K8s Scaling Experiment Runner

positional arguments:
cluster_name Cluster name suffix

optional arguments:
-h, --help show this help message and exit
--outdir OUTDIR output directory for results
--experiment EXPERIMENT Experiment name (defaults to script name)
--start-iter START_ITER start at this iteration
--end-iter ITERS end at this iteration
--max-node-count MAX_NODE_COUNT maximum node count
--min-node-count MIN_NODE_COUNT minimum node count
--start-node-count START_NODE_COUNT start at this many nodes and go up
--machine-type MACHINE_TYPE AWS machine type
--increment INCREMENT Increment by this value
--down Test scaling down
--eks-nodegroup Include this to use an EKS managed nodegroup; otherwise a CloudFormation stack is used
```
Example:
```console
python3 test-scale.py --increment 16 cluster-64-node --max-node-count 64 --min-node-count 0 --start-iter 0 --end-iter 5
```
## Metrics
This program tracks several timings:

| Metric | Description |
| :---------------- | :------ |
| create_vpc_stack | Time to create the CloudFormation VPC stack |
| new_cluster | Time to deploy a cluster using the boto3 EKS `create_cluster` call |
| create_workers_stack | Time to create an EKS nodegroup or CloudFormation worker stack (depending on the option provided when creating the cluster) |
| wait_for_nodes | If an initial node count was specified, how long it takes for Kubernetes to register those nodes |
| create_cluster | Total aggregated time to create a cluster, including all of the metrics above |
| watch_for_nodes_in_aws | During a scale-up we track three timings in parallel; this one measures how long it takes for a node to appear in AWS after the scale-up operation is issued |
| wait_for_stack_updates | Scale-up works by updating the CloudFormation stack or EKS nodegroup, which changes the desired size of the autoscaling group; this tracks how long that update takes to complete |
| wait_for_nodes_in_k8s | How long it takes for the new nodes to show up in Kubernetes once the scale-up operation is applied |
| delete_workers_stack | Time to delete the CloudFormation worker stack or EKS nodegroup |
| _delete_cluster | Time spent in the `boto3.eks.delete_cluster()` call |
| delete_vpc | Time to delete the CloudFormation VPC stack |
| delete_cluster | Total time to tear down a cluster, including the deletion times above |
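
Timings like these can be captured with a small context manager. The sketch below is illustrative only; the `record` helper and `data` dict are hypothetical names, not kubescaler's internal API:

```python
import time
from contextlib import contextmanager

# Hypothetical results store; kubescaler keeps its own structure internally.
data = {"times": {}}

@contextmanager
def record(name):
    """Record the wall-clock duration of a phase under data['times'][name]."""
    start = time.time()
    try:
        yield
    finally:
        data["times"][name] = time.time() - start

# Example: timing two phases of a (stubbed) cluster creation.
with record("create_vpc_stack"):
    time.sleep(0.1)  # stand-in for creating the CloudFormation VPC stack
with record("new_cluster"):
    time.sleep(0.1)  # stand-in for the boto3 eks create_cluster call

print(data["times"])
```
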
21 changes: 18 additions & 3 deletions examples/aws/test-scale.py
@@ -44,13 +44,22 @@ def get_parser():
parser.add_argument(
"--min-node-count", help="minimum node count", type=int, default=0
)
# temporarily starting with 0 nodes
parser.add_argument(
"--start-node-count",
help="start at this many nodes and go up",
type=int,
default=1,
default=0,
)
parser.add_argument(
"--machine-type", help="AWS machine type", default="hpc6a.48xlarge"
)
parser.add_argument(
"--eks-nodegroup",
action="store_true",
help="set this to use eks nodegroup for instances, otherwise, it'll use cloudformation stack",
default=False,
)
parser.add_argument("--machine-type", help="AWS machine type", default="m5.large")
parser.add_argument(
"--increment", help="Increment by this value", type=int, default=1
)
@@ -93,7 +102,7 @@ def main():
print(f"📛️ Experiment name is {experiment_name}")

# Prepare an output directory, named by cluster
outdir = os.path.join(args.outdir, experiment_name, cluster_name)
outdir = os.path.join(args.outdir, experiment_name, args.machine_type, cluster_name)
if not os.path.exists(outdir):
print(f"📁️ Creating output directory {outdir}")
os.makedirs(outdir)
@@ -110,6 +119,10 @@ def increase_by(node_count):
if node_count + args.increment < args.max_node_count:
return args.increment

# Temporary workaround to not exceed the max and only scale up by increment
if node_count + args.increment > args.max_node_count:
return 0

# Otherwise, return the difference (the largest step we can take)
return args.max_node_count - node_count

Expand Down Expand Up @@ -145,6 +158,7 @@ def decrease_by(node_count):
machine_type=args.machine_type,
min_nodes=args.min_node_count,
max_nodes=args.max_node_count,
eks_nodegroup=args.eks_nodegroup,
)
# Load a result if we have it
if os.path.exists(results_file):
@@ -196,6 +210,7 @@ def decrease_by(node_count):
increment = next_increment(node_count)

# Delete the cluster and clean up
print(f"⚔️ Deleting the cluster - {cluster_name}")
cli.delete_cluster()
print(json.dumps(cli.data, indent=4))
cli.save(results_file)
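
A note on the scale-step logic in this diff: `next_increment` clamps each step so the node count never exceeds `--max-node-count`, including the temporary workaround that refuses steps that would overshoot. A standalone sketch of that clamping, with explicit parameters standing in for the script's `args`:

```python
def next_increment(node_count, increment, max_node_count):
    """Return the next scale-up step, clamped so we never exceed the max."""
    # A full increment still fits below the max.
    if node_count + increment < max_node_count:
        return increment
    # Workaround from the diff above: refuse steps that would overshoot.
    if node_count + increment > max_node_count:
        return 0
    # Exact fit: take the remaining distance to the max.
    return max_node_count - node_count

assert next_increment(0, 16, 64) == 16   # normal step
assert next_increment(48, 16, 64) == 16  # lands exactly on the max
assert next_increment(56, 16, 64) == 0   # would overshoot, so do nothing
```
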
89 changes: 89 additions & 0 deletions examples/flux_operator_ca_hpa/README.md
@@ -0,0 +1,89 @@
# Setup Kubernetes Cluster with Cluster Autoscaling

## Deploy the cluster
This script creates, deletes, and scales an EKS cluster. Nodes can be managed either by an EKS nodegroup or by a CloudFormation stack.

```
python3 k8s_cluster_operations.py -h
positional arguments:
cluster_name Cluster name suffix
optional arguments:
-h, --help show this help message and exit
--experiment EXPERIMENT
Experiment name (defaults to script name)
--node-count NODE_COUNT
starting node count of the cluster
--max-node-count MAX_NODE_COUNT
maximum node count
--min-node-count MIN_NODE_COUNT
minimum node count
--machine-type MACHINE_TYPE
AWS EC2 Instance types
--operation [{create,delete,scale}]
Which operation to perform. If you want to scale, be sure to change NODE_COUNT; the cluster will scale relative to its current size, and if NODE_COUNT is less than the current size, the cluster nodes will be scaled down.
--eks-nodegroup
Include this option to use an EKS nodegroup for instances; otherwise a CloudFormation stack is used. An EKS nodegroup automatically sets tags on the AWS autoscaling group so that the cluster autoscaler can discover it.
--enable-cluster-autoscaler
Include this to enable cluster autoscaling. This also creates an OIDC provider for the cluster. Be sure to take note of the Role ARN that this script prints.
```

Example usage

```console
basicinsect:flux_operator_ca_hpa hossen1$ python3 k8s_cluster_operations.py --operation "create" --enable-cluster-autoscaler --eks-nodegroup
📛️ Cluster name is kubernetes-flux-operator-hpa-ca-cluster
⭐️ Creating the cluster sized 1 to 5...
🥞️ Creating VPC stack and subnets...
🥣️ Creating cluster...
The status of nodegroup CREATING
Waiting for kubernetes-flux-operator-hpa-ca-cluster-worker-group nodegroup...
Setting Up the cluster OIDC Provider
The cluster autoscaler Role ARN - arn:aws:iam::<account-id>:role/AmazonEKSClusterAutoscalerRole

⏱️ Waiting for 1 nodes to be Ready...
Time for kubernetes to get nodes - 5.082208871841431
🦊️ Writing config file to kubeconfig-aws.yaml
Usage: kubectl --kubeconfig=kubeconfig-aws.yaml get nodes
```

## Set Up Cluster Autoscaler

Be sure to change two things in this file [cluster-autoscaler-autodiscover.yaml](cluster-autoscaler/cluster-autoscaler-autodiscover.yaml)

1. RoleARN `arn:aws:iam::<account-id>:role/AmazonEKSClusterAutoscalerRole` in the service account portion
2. Cluster Name - `kubernetes-flux-operator-hpa-ca-cluster` in the commands of the cluster autoscaler.

Then apply the changes:
```console
kubectl --kubeconfig=kubeconfig-aws.yaml apply -f cluster-autoscaler/cluster-autoscaler-autodiscover.yaml
```

Verify the cluster autoscaler is up:
```console
$ kubectl --kubeconfig=kubeconfig-aws.yaml get pods -n kube-system
NAME READY STATUS RESTARTS AGE
aws-node-2dz6x 1/1 Running 0 9h
aws-node-pzwl9 1/1 Running 0 9h
cluster-autoscaler-747689d74b-6lkfk 1/1 Running 0 8h
coredns-79df7fff65-q984f 1/1 Running 0 9h
coredns-79df7fff65-tlkwc 1/1 Running 0 9h
kube-proxy-8ch5x 1/1 Running 0 9h
kube-proxy-kq9ch 1/1 Running 0 9h
metrics-server-7db4fb59f9-qdp2c 1/1 Running 0 7h5m
```

The following prints the logs; be sure the cluster autoscaler discovered the autoscaling group and is working properly.
```console
kubectl --kubeconfig=kubeconfig-aws.yaml -n kube-system logs deploy/cluster-autoscaler
```

## Run application to collect metrics
Follow [README_CA_HPA.md](README_CA_HPA.md) to see how to run a program that collects metrics for horizontal pod autoscaling and cluster autoscaling.
38 changes: 38 additions & 0 deletions examples/flux_operator_ca_hpa/README_CA_HPA.md
@@ -0,0 +1,38 @@
# Metrics Collection
The purpose of this [file](application_ca_hpa_metrics.py) is to collect application and system metrics. The assumption is that we have a Kubernetes cluster with cluster autoscaling (CA) and horizontal pod autoscaling (HPA) enabled. The metrics and logs this file captures are:

1. How long a pod stays in the `Pending` state due to resource unavailability
2. How long it takes to run the container once the pod is scheduled
3. When HPA takes action based on observed CPU utilization
4. When there are pending pods, how long it takes for the cluster autoscaler to take action
5. When the cluster autoscaler adds new nodes
6. When the cluster autoscaler requests new nodes, how long it takes to get them
7. When the load decreases, how long it takes for HPA to scale down pods
8. When there is no load, how long it takes for CA to remove nodes
9. When the nodes are actually removed

We can answer the above questions, and many more, by collecting these metrics. The script saves its results in the data directory.
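
As one illustration of the kind of measurement involved, pod pending time (question 1) can be derived from the Kubernetes watch API. This is a minimal sketch using the official Python client, not the actual implementation in application_ca_hpa_metrics.py:

```python
import time

from kubernetes import client, config, watch

# Assumes the kubeconfig written when the cluster was created.
config.load_kube_config(config_file="kubeconfig-aws.yaml")
v1 = client.CoreV1Api()

pending_since = {}  # pod name -> time we first saw it Pending

w = watch.Watch()
for event in w.stream(v1.list_namespaced_pod, namespace="flux-operator"):
    pod = event["object"]
    name = pod.metadata.name
    if pod.status.phase == "Pending" and name not in pending_since:
        pending_since[name] = time.time()
    elif pod.status.phase == "Running" and name in pending_since:
        waited = time.time() - pending_since.pop(name)
        print(f"{name} was pending for {waited:.1f}s before running")
```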

Run the file as follows:
```console
python3 application_ca_hpa_metrics.py -h
usage: application_ca_hpa_metrics.py [-h] [--flux-namespace FLUX_NAMESPACE] [--autoscaler-namespace AUTOSCALER_NAMESPACE] [--hpa-namespace HPA_NAMESPACE] [--kubeconfig KUBECONFIG] [--outdir OUTDIR]

Program to collect various metrics from kubernetes

optional arguments:
-h, --help
show this help message and exit

--flux-namespace FLUX_NAMESPACE
Namespace of the flux operator

--autoscaler-namespace AUTOSCALER_NAMESPACE
Namespace of the cluster autoscaler

--hpa-namespace HPA_NAMESPACE
Namespace of the horizontal pod autoscaler

--kubeconfig KUBECONFIG
config file name, full path if the file is not in the current directory
```
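
For example, pointing the collector at the kubeconfig generated earlier (the flag values here are illustrative, reusing names from the setup above):

```console
python3 application_ca_hpa_metrics.py --flux-namespace flux-operator --kubeconfig kubeconfig-aws.yaml
```
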
49 changes: 49 additions & 0 deletions examples/flux_operator_ca_hpa/basic-minicluster-setup/README.md
@@ -0,0 +1,49 @@
# Flux Operator Mini Cluster Setup

## Basic Minicluster setup
This setup assumes you have already created a Kubernetes cluster with at least 1-2 nodes by following the directions [here](../README.md).

Create the flux-operator namespace and install the operator:

```bash
$ kubectl create namespace flux-operator
$ kubectl apply -f operator-minicluster/basic-configs/flux-operator.yaml
```


Then create the MiniCluster:

```bash
$ kubectl apply -f operator-minicluster/basic-configs/minicluster.yaml
```

You'll need to wait for the containers to pull and go from creating to running (status `ContainerCreating` to `Running`).

```bash
$ kubectl get -n flux-operator pods
NAME READY STATUS RESTARTS AGE
flux-sample-0-4wmmp 1/1 Running 0 6m50s
flux-sample-1-mjj7b 1/1 Running 0 6m50s
```

## Flux Cluster with the LAMMPS Application

For this setup we cannot use the Python API: LAMMPS currently requires a placement group, and the boto3 API lacks support for providing a `placement group` option. So we will use `eksctl`. If you don't have `eksctl`, please install it first.

```console
eksctl create cluster -f operator-minicluster/hpc7g-configs/eks-efa-cluster-config-hpc7g.yaml
```

This will create a cluster with managed nodegroup, oidc provider, and service account for cluster autoscaler.

Now deploy an ARM version of the Flux Operator.
```console
kubectl apply -f operator-minicluster/hpc7g-configs/flux-operator-arm.yaml
```

This will create our size 1 MiniCluster that we will run LAMMPS on many times:
```
kubectl create namespace flux-operator
kubectl apply -f operator-minicluster/hpc7g-configs/minicluster-libfabric-new.yaml # 18.1.1
```

More to follow...