diff --git a/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/README.md b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/README.md
new file mode 100644
index 0000000000..7f0c062080
--- /dev/null
+++ b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/README.md
@@ -0,0 +1,153 @@
+# A3-Ultra Slurm + Ubuntu + GCS
+
+This reference design creates a Slurm cluster with the following characteristics:
+
+1. Ubuntu 22 Operating System
+1. A static a3-ultragpu-8g partition that uses a reservation.
+1. 3 VPCs (2x CPU, 1x for GPU RDMA networks), with a total of 9 subnetworks
+1. A GCS bucket that is configured with Hierarchical Namespace enabled
+1. Cloud Storage Fuse, configured to utilize Local-SSD storage
+
+## Deployment Instructions
+
+### Build the Cluster Toolkit gcluster binary
+
+Follow the instructions
+[here](https://cloud.google.com/cluster-toolkit/docs/setup/configure-environment).
+
+### (Optional, but recommended) Create a GCS Bucket for storing Terraform state
+
+```bash
+#!/bin/bash
+
+TF_STATE_BUCKET_NAME=
+PROJECT_ID=
+REGION=
+
+gcloud storage buckets create gs://${TF_STATE_BUCKET_NAME} \
+  --project=${PROJECT_ID} \
+  --default-storage-class=STANDARD --location=${REGION} \
+  --uniform-bucket-level-access
+gcloud storage buckets update gs://${TF_STATE_BUCKET_NAME} --versioning
+```
+
+### Create and configure a GCS Bucket
+
+This bucket will be used for input data and checkpoint/restart data, and it
+should be created with Hierarchical Namespace enabled. See
+[here](https://cloud.google.com/storage/docs/hns-overview) for more details.
+
+```bash
+#!/bin/bash
+PROJECT_ID=
+REGION=
+HNS_BUCKET_NAME=
+
+gcloud storage buckets create gs://${HNS_BUCKET_NAME} \
+  --project=${PROJECT_ID} \
+  --location=${REGION} --uniform-bucket-level-access \
+  --enable-hierarchical-namespace
+
+```
+
+### Create/modify the deployment.yaml file with your preferred configuration
+
+For example, set the cluster size and the reservation to be used, as well as
+the name of the bucket that you just created. Below is an example:
+
+```yaml
+---
+terraform_backend_defaults:
+  type: gcs
+  configuration:
+    bucket: TF_STATE_BUCKET_NAME
+
+vars:
+  deployment_name: a3u-gcs
+  project_id:
+  region:
+  zone:
+  a3u_reservation_name:
+  a3u_cluster_size:
+  hns_gcs_bucket: # This bucket must have been previously created
+
+```
+
+### Deploy the cluster
+
+```bash
+#!/bin/bash
+gcluster deploy -d deployment.yaml a3u-slurm-ubuntu-gcs.yaml
+```
+
+## Storage Design Components
+
+On the login and controller nodes, the GCS bucket is mounted at `/gcs` using a
+fairly standard [Cloud Storage Fuse configuration](https://cloud.google.com/storage/docs/cloud-storage-fuse/config-file).
+On the compute nodes, there are two mounts of the same bucket. First, `/gcs` is
+mounted with the following configuration:
+
+```yaml
+file-cache:
+  max-size-mb: -1
+  enable-parallel-downloads: true
+  download-chunk-size-mb: 50
+  parallel-downloads-per-file: 16
+cache-dir: /mnt/localssd
+file-system:
+  dir-mode: "777"
+  file-mode: "777"
+  rename-dir-limit: 20000 # Set to 20000 for hierarchical buckets
+  temp-dir: /mnt/localssd
+  fuse-options: allow_other
+foreground: true
+```
+
+This uses /mnt/localssd as the cache dir (for reads) and temp-dir (for writes).
+It also enables parallel downloads, which is particularly useful for
+checkpoint restarts.
+
+Next, `/gcs-ro` is mounted in a "read-only" mode and optimized for reading
+input (training) data:
+
+```yaml
+file-cache:
+  max-size-mb: -1
+metadata-cache:
+  ttl-secs: 3600 # Decrease if your data changes quickly.
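+  # Note: in Cloud Storage FUSE, this metadata-cache TTL controls how long
+  # stat and type (metadata) entries are cached for the read-only mount.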
+cache-dir: /mnt/localssd
+file-system:
+  dir-mode: "755" # need 5 on dir to enable ls
+  file-mode: "644"
+  temp-dir: /mnt/localssd
+  fuse-options: allow_other
+  kernel-list-cache-ttl-secs: 60
+foreground: true
+```
+
+The local SSDs are used as a file cache, the metadata cache for the data is set
+to 1 hour, and the kernel list-cache TTL is set to 60 seconds. This reduces the
+number of requests sent to GCS and improves data-loading performance.
+
+We suggest using /gcs for checkpoint saving/loading and /gcs-ro for input data
+loading.
+
+## Running Benchmarks with Ramble
+
+To run a series of NCCL test benchmarks on your cluster, you can use the
+script `run-nccl-tests-via-ramble.sh`, which uses
+[ramble](https://github.com/GoogleCloudPlatform/ramble) to automate building
+and running NCCL tests at scales from 2 nodes up to 32 nodes.
+
+Copy the contents of `run-nccl-tests-via-ramble.sh` to your Slurm login or
+controller node, for example:
+
+```bash
+#!/bin/bash
+wget -np -nd https://raw.githubusercontent.com/GoogleCloudPlatform/cluster-toolkit/refs/heads/develop/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/run-nccl-tests-via-ramble.sh
+```
+
+and then launch with `bash run-nccl-tests-via-ramble.sh`. The entire process
+will take ~30 minutes.
diff --git a/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/a3u-slurm-ubuntu-gcs.yaml b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/a3u-slurm-ubuntu-gcs.yaml
new file mode 100644
index 0000000000..7be9f89a00
--- /dev/null
+++ b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/a3u-slurm-ubuntu-gcs.yaml
@@ -0,0 +1,615 @@
+# Copyright 2024 Google LLC
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+---
+
+blueprint_name: a3u-slurm-ubuntu-gcs
+
+vars:
+  # The following are supplied through the deployment.yaml file.
+  deployment_name: # supply deployment name
+  project_id: # supply project ID
+  region: # supply region
+  zone: # supply zone
+  a3u_cluster_size: # supply cluster size
+  a3u_reservation_name: # supply reservation name
+  hns_gcs_bucket: # Name of HNS enabled GCS bucket
+  # End of variables defined by deployment.yaml. The remainder
+  # of this blueprint need not be modified.
+ + # Image settings + base_image: + project: ubuntu-os-accelerator-images + family: ubuntu-accelerator-2204-amd64-with-nvidia-550 + image_build_machine_type: n2-standard-16 + build_slurm_from_git_ref: 6.8.6 + + # Cluster env settings + # net0 and filestore ranges must not overlap + net0_range: 192.168.0.0/19 + filestore_ip_range: 192.168.32.0/24 + net1_range: 192.168.64.0/18 + rdma_net_range: 192.168.128.0/18 + + # Cluster Settings + local_ssd_mountpoint: /mnt/localssd + instance_image: + project: $(vars.project_id) + family: $(vars.deployment_name)-u22 + disk_size_gb: 200 + nccl_plugin_version: v1.0.2 + + # Here we define a set of startup script runners that are used to configure + # the controller node + controller_runners: + - type: shell + destination: stage_scripts.sh + content: | + #!/bin/bash + SLURM_ROOT=/opt/apps/adm/slurm + PARTITION_NAME=a3ultra + mkdir -m 0755 -p "${SLURM_ROOT}/scripts" + mkdir -p "${SLURM_ROOT}/partition-${PARTITION_NAME}-epilog_slurmd.d" + ln -s "/slurm/scripts/tools/gpu-test" "${SLURM_ROOT}/partition-${PARTITION_NAME}-epilog_slurmd.d/gpu-test.epilog_slurmd" + + # Shared runners between login and controller: + # Configure an enroot config path + shared_runners: + - type: data + destination: /etc/enroot/enroot.conf + content: | + ENROOT_CONFIG_PATH ${HOME}/.enroot + + # Here we define a set of startup script runners that are used to configure + # the A3-Ultra nodes + # Set up enroot, using the local ssds for runtime/cache/data/temp storage. + a3u_runners: + - type: data + destination: /etc/enroot/enroot.conf + content: | + ENROOT_CONFIG_PATH ${HOME}/.enroot + ENROOT_RUNTIME_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/runtime + ENROOT_CACHE_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/cache + ENROOT_DATA_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/data + ENROOT_TEMP_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot + + # Install NCCL Network Plugin + - type: ansible-local + destination: nccl_plugin.yml + content: | + --- + - name: Install NCCL plugin for A3 Ultra series + hosts: all + become: true + tasks: + - name: Add SystemD unit for NCCL plugin installation + ansible.builtin.copy: + dest: /etc/systemd/system/nccl-plugin@.service + mode: 0o0644 + content: | + [Unit] + After=network-online.target + Before=slurmd.service + + [Service] + Type=oneshot + ExecStartPre=/usr/bin/rm -rf /usr/local/gib + ExecStartPre=/usr/bin/mkdir -p /usr/local/gib + ExecStartPre=/snap/bin/gcloud auth configure-docker --quiet us-docker.pkg.dev + ExecStart=/usr/bin/docker run --rm --name nccl-gib-installer --volume /usr/local/gib:/var/lib/gib \ + us-docker.pkg.dev/gce-ai-infra/gpudirect-gib/nccl-plugin-gib:%i install --install-nccl + + [Install] + WantedBy=slurmd.service + notify: + - Reload SystemD + handlers: + - name: Reload SystemD + ansible.builtin.systemd: + daemon_reload: true + post_tasks: + - name: Enable NCCL plugin SystemD unit + ansible.builtin.service: + name: nccl-plugin@$(vars.nccl_plugin_version).service + state: started + enabled: true + + # Configure Cloud Storage FUSE + - type: ansible-local + destination: gcsfuse.yml + content: | + --- + - name: Create LSSD optimized gcsfuse mount + hosts: all + become: true + tasks: + - name: Create gcsfuse rwx configuration + ansible.builtin.copy: + dest: /etc/gcsfuse-lssd.yml + owner: root + group: root + mode: 0o644 + content: | + file-cache: + max-size-mb: -1 + enable-parallel-downloads: true + download-chunk-size-mb: 50 + parallel-downloads-per-file: 16 + cache-dir: /mnt/localssd + file-system: + dir-mode: "777" + 
file-mode: "777" + rename-dir-limit: 20000 # Set to 20000 for hierarchical buckets + temp-dir: /mnt/localssd + fuse-options: allow_other + foreground: true + + - name: Create gcsfuse read-only configuration for input data + ansible.builtin.copy: + dest: /etc/gcsfuse-ro.yml + owner: root + group: root + mode: 0o644 + content: | + file-cache: + max-size-mb: -1 + metadata-cache: + ttl-secs: 3600 # Decrease if your data changes quickly. + cache-dir: /mnt/localssd + file-system: + dir-mode: "755" # need 5 on dir to enable ls + file-mode: "644" + temp-dir: /mnt/localssd + fuse-options: allow_other + kernel-list-cache-ttl-secs: 60 + foreground: true + + - name: Create gcsfuse systemd service + ansible.builtin.copy: + dest: /etc/systemd/system/gcsfuse-lssd.service + owner: root + group: root + mode: 0o644 + content: | + [Unit] + Description=gcsfuse mount of all buckets + After=local-fs.target + + [Service] + Type=simple + User=root + ExecStartPre=/bin/mkdir -p /gcs + ExecStart=gcsfuse --config-file /etc/gcsfuse-lssd.yml $(vars.hns_gcs_bucket) /gcs + ExecStop=fusermount3 -u /gcs + + [Install] + WantedBy=slurmd.service multi-user.target + + - name: Create read-only gcsfuse systemd service + ansible.builtin.copy: + dest: /etc/systemd/system/gcsfuse-ro.service + owner: root + group: root + mode: 0o644 + content: | + [Unit] + Description=gcsfuse-ro mount + After=local-fs.target + + [Service] + Type=simple + User=root + ExecStartPre=/bin/mkdir -p /gcs-ro + ExecStart=gcsfuse --config-file /etc/gcsfuse-ro.yml $(vars.hns_gcs_bucket) /gcs-ro + ExecStop=fusermount3 -u /gcs-ro + + [Install] + WantedBy=slurmd.service multi-user.target + + post_tasks: + - name: Enable and restart gcsfuse + ansible.builtin.service: + name: gcsfuse-lssd.service + state: restarted + enabled: true + + - name: Enable and restart gcsfuse-ro + ansible.builtin.service: + name: gcsfuse-ro.service + state: restarted + enabled: true + + # Configure Cloud Storage FUSE for login/controller nodes + gcsfuse_runners: + - type: ansible-local + destination: gcsfuse.yml + content: | + --- + - name: Create Standard RWX gcsfuse mount + hosts: localhost + become: true + tasks: + - name: Create gcsfuse configuration + ansible.builtin.copy: + dest: /etc/gcsfuse.yml + owner: root + group: root + mode: 0o644 + content: | + file-system: + dir-mode: "777" + file-mode: "777" + rename-dir-limit: 20000 + fuse-options: allow_other + foreground: true + + - name: Create gcsfuse systemd service + ansible.builtin.copy: + dest: /etc/systemd/system/gcsfuse.service + owner: root + group: root + mode: 0o644 + content: | + [Unit] + Description=gcsfuse mount of all buckets + After=local-fs.target + + [Service] + Type=simple + User=root + ExecStartPre=/bin/mkdir -p /gcs + ExecStart=gcsfuse --config-file /etc/gcsfuse.yml $(vars.hns_gcs_bucket) /gcs + ExecStop=fusermount3 -u /gcs + + [Install] + WantedBy=slurmd.service multi-user.target + + post_tasks: + - name: Enable and restart gcsfuse + ansible.builtin.service: + name: gcsfuse.service + state: restarted + enabled: true + +deployment_groups: +- group: image-env + modules: + - id: slurm-image-network + source: modules/network/vpc + + - id: slurm-build-script + source: modules/scripts/startup-script + settings: + install_ansible: true + docker: + enabled: true + runners: + - type: data + destination: /etc/cluster_toolkit/a3ultra-prod-slurm-image.yaml + source: ../.ghpc/artifacts/expanded_blueprint.yaml + - type: data + destination: /var/tmp/slurm_vars.json + content: | + { + "reboot": false, + "install_cuda": false, + 
"install_gcsfuse": true, + "install_lustre": false, + "install_ompi": true, + "update_kernel": false, + "monitoring_agent": "cloud-ops", + } + - type: shell + destination: install_slurm.sh + content: | + #!/bin/bash + set -e -o pipefail + ansible-pull \ + -U https://github.com/GoogleCloudPlatform/slurm-gcp -C $(vars.build_slurm_from_git_ref) \ + -i localhost, --limit localhost --connection=local \ + -e @/var/tmp/slurm_vars.json \ + ansible/playbook.yml + # this duplicates the ulimits configuration of the HPC VM Image + - type: data + destination: /etc/security/limits.d/99-unlimited.conf + content: | + * - memlock unlimited + * - nproc unlimited + * - stack unlimited + * - nofile 1048576 + * - cpu unlimited + * - rtprio unlimited + - type: data + destination: /etc/systemd/system/slurmd.service.d/file_ulimit.conf + content: | + [Service] + LimitNOFILE=infinity + - type: data + destination: /etc/netplan/60-cloud-mrdma-init.yaml + content: | + network: + ethernets: + primary: + match: + name: enp0s* + driver: gve + dhcp4: true + dhcp4-overrides: + use-domains: true + dhcp6: true + dhcp6-overrides: + use-domains: true + optional: true + secondary: + match: + driver: gve + dhcp4: true + dhcp4-overrides: + use-domains: false + use-dns: false + use-ntp: false + dhcp6: true + dhcp6-overrides: + use-domains: false + use-dns: false + use-ntp: false + optional: true + mrdma_devices: + match: + driver: mlx5_core + dhcp-identifier: mac + dhcp4: true + dhcp4-overrides: + use-domains: true + use-dns: false + use-ntp: false + optional: true + version: 2 + - type: ansible-local + destination: configure_gpu.yml + content: | + --- + - name: Install NVIDIA packages + hosts: all + become: true + vars: + distribution: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.','') }}" + cuda_repo_url: https://developer.download.nvidia.com/compute/cuda/repos/{{ distribution }}/x86_64/cuda-keyring_1.1-1_all.deb + cuda_repo_filename: /tmp/{{ cuda_repo_url | basename }} + enable_nvidia_dcgm: false + nvidia_packages: + - cuda-toolkit-12-4 + - datacenter-gpu-manager + - libnvidia-nscq-550 + tasks: + - name: Download NVIDIA repository package + ansible.builtin.get_url: + url: "{{ cuda_repo_url }}" + dest: "{{ cuda_repo_filename }}" + - name: Install NVIDIA repository package + ansible.builtin.apt: + deb: "{{ cuda_repo_filename }}" + state: present + - name: Reduce NVIDIA repository priority + ansible.builtin.copy: + dest: /etc/apt/preferences.d/cuda-repository-pin-600 + mode: 0o0644 + owner: root + group: root + content: | + Package: nsight-compute + Pin: origin *ubuntu.com* + Pin-Priority: -1 + + Package: nsight-systems + Pin: origin *ubuntu.com* + Pin-Priority: -1 + + Package: * + Pin: release l=NVIDIA CUDA + Pin-Priority: 400 + - name: Install NVIDIA fabric and CUDA + ansible.builtin.apt: + name: "{{ item }}" + update_cache: true + loop: "{{ nvidia_packages }}" + - name: Freeze NVIDIA fabric and CUDA + ansible.builtin.dpkg_selections: + name: "{{ item }}" + selection: hold + loop: "{{ nvidia_packages }}" + post_tasks: + - name: Disable NVIDIA DCGM by default (enable during boot on GPU nodes) + ansible.builtin.service: + name: nvidia-dcgm.service + state: stopped + enabled: false + - type: ansible-local + destination: install_mellanox_drivers.yml + content: | + --- + - name: Update Netplan and Install Network Utils + hosts: all + become: true + tasks: + - name: Install Linux Modules Extra + ansible.builtin.package: + name: + - ibverbs-utils + state: present + - name: Apply netplan + 
ansible.builtin.command: netplan apply + +- group: image + modules: + - id: slurm-a3ultra-image + source: modules/packer/custom-image + kind: packer + settings: + disk_size: $(vars.disk_size_gb) + machine_type: $(vars.image_build_machine_type) + source_image_family: $(vars.base_image.family) + source_image_project_id: [$(vars.base_image.project)] + image_family: $(vars.instance_image.family) + omit_external_ip: false + use: + - slurm-image-network + - slurm-build-script + +- group: cluster-env + modules: + - id: a3ultra-slurm-net-0 + source: modules/network/vpc + settings: + network_name: $(vars.deployment_name)-net-0 + mtu: 8896 + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-0 + subnet_region: $(vars.region) + subnet_ip: $(vars.net0_range) + + - id: a3ultra-slurm-net-1 + source: modules/network/vpc + settings: + network_name: $(vars.deployment_name)-net-1 + mtu: 8896 + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-1 + subnet_region: $(vars.region) + subnet_ip: $(vars.net1_range) + + - id: a3ultra-slurm-rdma-net + source: modules/network/gpu-rdma-vpc + settings: + network_name: $(vars.deployment_name)-rdma-net + network_profile: https://www.googleapis.com/compute/beta/projects/$(vars.project_id)/global/networkProfiles/$(vars.zone)-vpc-roce + network_routing_mode: REGIONAL + nic_type: MRDMA + subnetworks_template: + name_prefix: $(vars.deployment_name)-mrdma-sub + count: 8 + ip_range: $(vars.rdma_net_range) + region: $(vars.region) + + - id: homefs + source: modules/file-system/filestore + use: + - a3ultra-slurm-net-0 + settings: + filestore_tier: HIGH_SCALE_SSD + size_gb: 10240 + local_mount: /home + reserved_ip_range: $(vars.filestore_ip_range) + deletion_protection: + enabled: true + reason: Avoid data loss + outputs: + - network_storage + +- group: cluster + modules: + - id: a3ultra_startup + source: modules/scripts/startup-script + settings: + local_ssd_filesystem: + mountpoint: $(vars.local_ssd_mountpoint) + permissions: "1777" # must quote numeric filesystem permissions! 
+ docker: + enabled: true + world_writable: true + daemon_config: | + { + "data-root": "$(vars.local_ssd_mountpoint)/docker" + } + runners: $(flatten([vars.a3u_runners])) + + - id: a3_ultra_nodeset + source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset + use: [a3ultra-slurm-net-0, a3ultra_startup] + settings: + bandwidth_tier: gvnic_enabled + machine_type: a3-ultragpu-8g + instance_image_custom: true + enable_public_ips: true + node_count_static: $(vars.a3u_cluster_size) + node_count_dynamic_max: 0 + enable_placement: false + disk_type: hyperdisk-balanced + on_host_maintenance: TERMINATE + reservation_name: $(vars.a3u_reservation_name) + additional_networks: + $(concat( + [{ + network=null, + subnetwork=a3ultra-slurm-net-1.subnetwork_self_link, + subnetwork_project=vars.project_id, + nic_type="GVNIC", + queue_count=null, + network_ip="", + stack_type=null, + access_config=[], + ipv6_access_config=[], + alias_ip_range=[] + }], + a3ultra-slurm-rdma-net.subnetwork_interfaces + )) + + - id: a3_ultra_partition + source: community/modules/compute/schedmd-slurm-gcp-v6-partition + use: + - a3_ultra_nodeset + settings: + exclusive: false + partition_name: a3ultra + is_default: true + partition_conf: + ResumeTimeout: 900 + SuspendTimeout: 600 + OverSubscribe: EXCLUSIVE + + - id: controller_startup + source: modules/scripts/startup-script + settings: + runners: $(flatten([vars.shared_runners, vars.controller_runners, vars.gcsfuse_runners])) + + - id: login_startup + source: modules/scripts/startup-script + settings: + runners: $(flatten([vars.shared_runners, vars.gcsfuse_runners])) + + - id: slurm_login + source: community/modules/scheduler/schedmd-slurm-gcp-v6-login + use: [a3ultra-slurm-net-0] + settings: + instance_image_custom: true + disk_size_gb: 300 + enable_login_public_ips: true + machine_type: n2-standard-8 + + - id: slurm_controller + source: community/modules/scheduler/schedmd-slurm-gcp-v6-controller + use: + - a3ultra-slurm-net-0 + - a3_ultra_partition + - slurm_login + - homefs + settings: + enable_controller_public_ips: true + instance_image_custom: true + disk_type: pd-extreme + disk_size_gb: 300 + machine_type: n2-standard-80 + controller_startup_script: $(controller_startup.startup_script) + login_startup_script: $(login_startup.startup_script) + enable_external_prolog_epilog: true diff --git a/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/deployment.yaml b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/deployment.yaml new file mode 100644 index 0000000000..d955eda1f4 --- /dev/null +++ b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/deployment.yaml @@ -0,0 +1,31 @@ +# Copyright 2024 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +--- +# If using GCS as a terraform backend (suggested), add the following. If not, +# comment out or remove. +terraform_backend_defaults: + type: gcs + configuration: + bucket: # Name of terraform state bucket. +# End of optional section + +vars: + deployment_name: # Unique name of this Cluster Toolkit Deployment, e.g. 
a3u-gcs + project_id: # Your GCP project name + region: # e.g. europe-west1 + zone: # e.g. europe-west1-b + a3u_reservation_name: # reservation name, e.g. a3u-reservation-00 + a3u_cluster_size: # Number of A3-Ultra nodes in the cluster + hns_gcs_bucket: # This bucket must have been previously created diff --git a/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/run-nccl-tests-via-ramble.sh b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/run-nccl-tests-via-ramble.sh new file mode 100644 index 0000000000..62061533f3 --- /dev/null +++ b/examples/hypercompute_clusters/a3u-slurm-ubuntu-gcs/run-nccl-tests-via-ramble.sh @@ -0,0 +1,224 @@ +#!/bin/bash +# Copyright 2024 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +set -eu + +trap "printf '\nCaught Ctrl+c. Exiting...\n'; exit" INT + +# Use current unix timestamp as a unique tag +# for jobs submitted +TAG=$(date +%s) +TEST_DIR=nccl-tests-"${TAG}" +SOFTWARE_INSTALL=/opt/apps + +cat <"${TEST_DIR}"/configs/ramble.yaml +# Ramble Configuration for NCCL Tests +ramble: + env_vars: + set: + OMPI_MCA_pml: "^ucx" + OMPI_MCA_btl: "^openib" + OMPI_MCA_btl_tcp_if_include: enp0s19 + + CUDA_VISIBLE_DEVICES: 0,1,2,3,4,5,6,7 + NCCL_NET: gIB + NCCL_SOCKET_IFNAME: enp0s19,enp192s20 + NCCL_CROSS_NIC: 0 + NCCL_NET_GDR_LEVEL: PIX + NCCL_P2P_NET_CHUNKSIZE: 131072 + NCCL_P2P_PCI_CHUNKSIZE: 131072 + NCCL_P2P_NVL_CHUNKSIZE: 524288 + NCCL_NVLS_CHUNKSIZE: 524288 + NCCL_IB_GID_INDEX: 3 + NCCL_IB_ADAPTIVE_ROUTING: 1 + NCCL_IB_QPS_PER_CONNECTION: 4 + NCCL_IB_TC: 52 + NCCL_IB_FIFO_TC: 84 + NCCL_SHIMNET_GUEST_CONFIG_CHECKER_CONFIG_FILE: /usr/local/gib/configs/guest_config.txtpb + NCCL_TUNER_CONFIG_PATH: /usr/local/gib/configs/tuner_config.txtpb + prepend: + - paths: + LD_LIBRARY_PATH: /usr/local/gib/lib64 + + variables: + mpi_command: srun --mpi=pmix + batch_submit: 'sbatch {execute_experiment}' + processes_per_node: '{gpus_per_node}' + gpus_per_node: '8' + applications: + nccl-tests: + workloads: + '{workload}': + experiments: + '{workload}-{n_nodes}': + variants: + package_manager: spack + variables: + workload: [all-gather, all-reduce, reduce-scatter] + n_nodes: [2, 4, 8, 16, 32] + matrix: + - n_nodes + - workload + + software: + packages: + pmix: + pkg_spec: pmix + mpi: + pkg_spec: openmpi +cuda cuda_arch=90 + cuda: + pkg_spec: cuda@12.4.0 + nccl: + pkg_spec: nccl@2.23.4-1 cuda_arch=90 + nccl-tests: + pkg_spec: nccl-tests cuda_arch=90 + environments: + nccl-tests: + packages: [cuda, mpi, nccl, nccl-tests, pmix] + +EOF + +# Populate slurm sbatch script +cat <"${TEST_DIR}"/configs/execute_experiment.tpl +#!/bin/bash +#SBATCH -J {experiment_name}-"${TAG}" +#SBATCH --output={experiment_run_dir}/slurm-%j.out +#SBATCH -N {n_nodes} +#SBATCH --gpus-per-node=8 +#SBATCH --exclusive +#SBATCH --ntasks-per-node={processes_per_node} + +cd "{experiment_run_dir}" +{command} +EOF + +# Get number of nodes available +N_NODES=$(sinfo -h -o %D) + +# Print available benchmarks +printf "\n--------- Setting up Benchmarks ----------\n" +ramble workspace info --where '{n_nodes} <= 
'"$N_NODES" + +printf "\n------- About to run the following: ------\n\n" +printf "source %s/ramble/env/bin/activate\n" "${SOFTWARE_INSTALL}" +printf ". %s/ramble/share/ramble/setup-env.sh\n" "${SOFTWARE_INSTALL}" +printf ". %s/spack/share/spack/setup-env.sh\n" "${SOFTWARE_INSTALL}" +printf "ramble workspace activate %s\n" "${TEST_DIR}" +printf "ramble workspace setup --where '{n_nodes} <= %s'\n" "${N_NODES}" +printf "ramble on --where '{n_nodes} <= %s' \n" "${N_NODES}" + +# Set up experiments +printf "\n--------- Setting up Benchmarks -------\n" +printf " This may take 20-30 minutes \n" +ramble workspace setup --where '{n_nodes} <= '"${N_NODES}" + +# Submit Experiments to Slurm +printf "\n----------- Running Benchmarks --------\n" +ramble on --where '{n_nodes} <= '"${N_NODES}" + +# Wait for all to be done +# Use the TAG in the slurm jobs +until [[ $(squeue -h -o %j | grep -c "${TAG}") -eq 0 ]]; do + clear + echo "waiting for $(squeue -h -o %j | grep -c "${TAG}") jobs to finish" + squeue + sleep 5 +done + +# Analyze +ramble workspace analyze -f json --where '{n_nodes} <= '"${N_NODES}" + +# Summarize all results in summary.tsv +cd "${TEST_DIR}" +jq -r '["workload","n_nodes","msg_size","busbw"], (.experiments[] as $exp | $exp.CONTEXTS[] as $context | +{ + experiment_name: $exp.name, + workload: $exp.workload_name, + n_nodes: $exp.n_nodes, + Context: $context.name +} + +($context.foms | from_entries ) +| [.workload, .n_nodes, .Size, ."Out of Place Bus Bandwidth"]) +| @tsv' results.latest.json >summary.tsv + +# Print just the 8GB message sizes +printf "\n--- SUMMARY for 8GB Message Sizes --\n" +jq -r '["workload","n_nodes","msg_size","busbw"], (.experiments[] as $exp | $exp.CONTEXTS[] as $context | +{ + experiment_name: $exp.name, + workload: $exp.workload_name, + n_nodes: $exp.n_nodes, + Context: $context.name +} + +($context.foms | from_entries ) +| select(.Size | tonumber > 8000000000) +| [.workload, .n_nodes, .Size, ."Out of Place Bus Bandwidth"]) +| @tsv' results.latest.json +printf "\nFor full results, see \"summary.tsv\"\n" + +printf "\n- To reactivate this ramble workspace, run -\n\n" +printf "source %s/ramble/env/bin/activate\n" "${SOFTWARE_INSTALL}" +printf ". %s/ramble/share/ramble/setup-env.sh\n" "${SOFTWARE_INSTALL}" +printf ". %s/spack/share/spack/setup-env.sh\n" "${SOFTWARE_INSTALL}" +printf "ramble workspace activate %s\n" "${TEST_DIR}" diff --git a/examples/machine-learning/a3-ultragpu-8g/README.md b/examples/machine-learning/a3-ultragpu-8g/README.md new file mode 100644 index 0000000000..dfa3bb17c5 --- /dev/null +++ b/examples/machine-learning/a3-ultragpu-8g/README.md @@ -0,0 +1,16 @@ +# A3 Ultra Blueprints + +For further information on deploying an A3 Ultra cluster with Slurm, please +see: + +[Create A3 Ultra Slurm Cluster](https://cloud.google.com/ai-hypercomputer/docs/create/create-slurm-cluster) + +If you are unable to access these documents, please contact your +[Technical Account Manager (TAM)](https://cloud.google.com/tam). + +## Deploy A3 Ultra compute VM with custom startup-scripts + +Customers can deploy [a3ultra-vm.yaml] blueprint to deploy 2 A3 Ultra VMs. You +can also specify custom startup-scripts to run in the blueprint. 
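+
+A deployment might then look like the following sketch (assuming the `gcluster`
+binary is on your `PATH`, and that you have first filled in `project_id` and the
+VM reservation name inside `a3ultra-vm.yaml`):
+
+```bash
+# Deploy directly from the blueprint; required vars such as project_id are set
+# inside the file rather than in a separate deployment.yaml.
+gcluster deploy a3ultra-vm.yaml
+
+# Tear the deployment down when finished; the deployment folder is named after
+# the blueprint's deployment_name.
+gcluster destroy a3ultra-vm-instance
+```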
+ +[a3ultra-vm.yaml]: ./a3ultra-vm.yaml diff --git a/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-blueprint.yaml b/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-blueprint.yaml new file mode 100644 index 0000000000..29b08add88 --- /dev/null +++ b/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-blueprint.yaml @@ -0,0 +1,451 @@ +# Copyright 2024 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +--- +# This blueprint uses private preview functionality in limited availability, +# see README.md for further information + +# This blueprint requires a Cluster Toolkit binary built from a +# release >= 1.44.0 + +blueprint_name: a3ultra-slurm + +vars: + deployment_name: # supply deployment name + project_id: # supply project ID + region: # supply region + zone: # supply zone + a3u_cluster_size: # supply cluster size + a3u_reservation_name: # supply reservation name + # Image settings + base_image: + project: ubuntu-os-accelerator-images + family: ubuntu-accelerator-2204-amd64-with-nvidia-550 + image_build_machine_type: n2-standard-16 + build_slurm_from_git_ref: 6.8.7 + # Cluster env settings + # net0 and filestore ranges must not overlap + net0_range: 192.168.0.0/19 + filestore_ip_range: 192.168.32.0/24 + net1_range: 192.168.64.0/18 + rdma_net_range: 192.168.128.0/18 + # Cluster Settings + local_ssd_mountpoint: /mnt/localssd + instance_image: + project: $(vars.project_id) + family: $(vars.deployment_name)-u22 + disk_size_gb: 200 + nccl_plugin_version: v1.0.2 + +deployment_groups: +- group: image-env + modules: + - id: slurm-image-network + source: modules/network/vpc + + - id: slurm-build-script + source: modules/scripts/startup-script + settings: + install_ansible: true + docker: + enabled: true + runners: + - type: data + destination: /etc/cluster_toolkit/a3ultra-prod-slurm-image.yaml + source: ../.ghpc/artifacts/expanded_blueprint.yaml + - type: data + destination: /var/tmp/slurm_vars.json + content: | + { + "reboot": false, + "install_cuda": false, + "install_gcsfuse": true, + "install_lustre": false, + "install_ompi": true, + "update_kernel": false, + "monitoring_agent": "cloud-ops", + } + - type: shell + destination: install_slurm.sh + content: | + #!/bin/bash + set -e -o pipefail + ansible-pull \ + -U https://github.com/GoogleCloudPlatform/slurm-gcp -C $(vars.build_slurm_from_git_ref) \ + -i localhost, --limit localhost --connection=local \ + -e @/var/tmp/slurm_vars.json \ + ansible/playbook.yml + # this duplicates the ulimits configuration of the HPC VM Image + - type: data + destination: /etc/security/limits.d/99-unlimited.conf + content: | + * - memlock unlimited + * - nproc unlimited + * - stack unlimited + * - nofile 1048576 + * - cpu unlimited + * - rtprio unlimited + - type: data + destination: /etc/systemd/system/slurmd.service.d/file_ulimit.conf + content: | + [Service] + LimitNOFILE=infinity + - type: data + destination: /etc/netplan/60-cloud-mrdma-init.yaml + content: | + network: + ethernets: + primary: + match: + name: enp0s* + driver: gve + dhcp4: 
true + dhcp4-overrides: + use-domains: true + dhcp6: true + dhcp6-overrides: + use-domains: true + optional: true + secondary: + match: + driver: gve + dhcp4: true + dhcp4-overrides: + use-domains: false + use-dns: false + use-ntp: false + dhcp6: true + dhcp6-overrides: + use-domains: false + use-dns: false + use-ntp: false + optional: true + mrdma_devices: + match: + driver: mlx5_core + dhcp-identifier: mac + dhcp4: true + dhcp4-overrides: + use-domains: true + use-dns: false + use-ntp: false + optional: true + version: 2 + - type: ansible-local + destination: configure_gpu.yml + content: | + --- + - name: Install NVIDIA packages + hosts: all + become: true + vars: + distribution: "{{ ansible_distribution | lower }}{{ ansible_distribution_version | replace('.','') }}" + cuda_repo_url: https://developer.download.nvidia.com/compute/cuda/repos/{{ distribution }}/x86_64/cuda-keyring_1.1-1_all.deb + cuda_repo_filename: /tmp/{{ cuda_repo_url | basename }} + enable_nvidia_dcgm: false + nvidia_packages: + - cuda-toolkit-12-4 + - datacenter-gpu-manager + - libnvidia-nscq-550 + tasks: + - name: Download NVIDIA repository package + ansible.builtin.get_url: + url: "{{ cuda_repo_url }}" + dest: "{{ cuda_repo_filename }}" + - name: Install NVIDIA repository package + ansible.builtin.apt: + deb: "{{ cuda_repo_filename }}" + state: present + - name: Reduce NVIDIA repository priority + ansible.builtin.copy: + dest: /etc/apt/preferences.d/cuda-repository-pin-600 + mode: 0o0644 + owner: root + group: root + content: | + Package: nsight-compute + Pin: origin *ubuntu.com* + Pin-Priority: -1 + + Package: nsight-systems + Pin: origin *ubuntu.com* + Pin-Priority: -1 + + Package: * + Pin: release l=NVIDIA CUDA + Pin-Priority: 400 + - name: Install NVIDIA fabric and CUDA + ansible.builtin.apt: + name: "{{ item }}" + update_cache: true + loop: "{{ nvidia_packages }}" + - name: Freeze NVIDIA fabric and CUDA + ansible.builtin.dpkg_selections: + name: "{{ item }}" + selection: hold + loop: "{{ nvidia_packages }}" + post_tasks: + - name: Disable NVIDIA DCGM by default (enable during boot on GPU nodes) + ansible.builtin.service: + name: nvidia-dcgm.service + state: stopped + enabled: false + - type: ansible-local + destination: install_mellanox_drivers.yml + content: | + --- + - name: Update Netplan and Install Network Utils + hosts: all + become: true + tasks: + - name: Install Linux Modules Extra + ansible.builtin.package: + name: + - ibverbs-utils + state: present + - name: Apply netplan + ansible.builtin.command: netplan apply + +- group: image + modules: + - id: slurm-a3ultra-image + source: modules/packer/custom-image + kind: packer + settings: + disk_size: $(vars.disk_size_gb) + machine_type: $(vars.image_build_machine_type) + source_image_family: $(vars.base_image.family) + source_image_project_id: [$(vars.base_image.project)] + image_family: $(vars.instance_image.family) + omit_external_ip: false + use: + - slurm-image-network + - slurm-build-script + +- group: cluster-env + modules: + - id: a3ultra-slurm-net-0 + source: modules/network/vpc + settings: + network_name: $(vars.deployment_name)-net-0 + mtu: 8896 + enable_internal_traffic: false # Setting firewall below instead + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-0 + subnet_region: $(vars.region) + subnet_ip: $(vars.net0_range) + firewall_rules: + - name: $(vars.deployment_name)-internal-0 + ranges: [$(vars.net0_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: a3ultra-slurm-net-1 + source: modules/network/vpc 
+ settings: + network_name: $(vars.deployment_name)-net-1 + mtu: 8896 + enable_internal_traffic: false # Setting firewall below instead + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-1 + subnet_region: $(vars.region) + subnet_ip: $(vars.net1_range) + firewall_rules: + - name: $(vars.deployment_name)-internal-1 + ranges: [$(vars.net1_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: a3ultra-slurm-rdma-net + source: modules/network/gpu-rdma-vpc + settings: + network_name: $(vars.deployment_name)-rdma-net + network_profile: https://www.googleapis.com/compute/beta/projects/$(vars.project_id)/global/networkProfiles/$(vars.zone)-vpc-roce + network_routing_mode: REGIONAL + subnetworks_template: + name_prefix: $(vars.deployment_name)-mrdma-sub + count: 8 + ip_range: $(vars.rdma_net_range) + region: $(vars.region) + firewall_rules: + - name: $(vars.deployment_name)-internal-rdma + ranges: [$(vars.rdma_net_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: homefs + source: modules/file-system/filestore + use: + - a3ultra-slurm-net-0 + settings: + filestore_tier: HIGH_SCALE_SSD + size_gb: 10240 + local_mount: /home + reserved_ip_range: $(vars.filestore_ip_range) + deletion_protection: + enabled: true + reason: Avoid data loss + outputs: + - network_storage + +- group: cluster + modules: + - id: a3ultra_startup + source: modules/scripts/startup-script + settings: + local_ssd_filesystem: + mountpoint: $(vars.local_ssd_mountpoint) + permissions: "1777" # must quote numeric filesystem permissions! + docker: + enabled: true + world_writable: true + daemon_config: | + { + "data-root": "$(vars.local_ssd_mountpoint)/docker" + } + runners: + - type: data + destination: /etc/enroot/enroot.conf + content: | + ENROOT_RUNTIME_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/runtime + ENROOT_CACHE_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/cache + ENROOT_DATA_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot/data + ENROOT_TEMP_PATH $(vars.local_ssd_mountpoint)/${UID}/enroot + - type: ansible-local + destination: nccl_plugin.yml + content: | + --- + - name: Install NCCL plugin for A3 Ultra series + hosts: all + become: true + tasks: + - name: Add SystemD unit for NCCL plugin installation + ansible.builtin.copy: + dest: /etc/systemd/system/nccl-plugin@.service + mode: 0o0644 + content: | + [Unit] + After=network-online.target + Before=slurmd.service + + [Service] + Type=oneshot + ExecStartPre=/usr/bin/rm -rf /usr/local/gib + ExecStartPre=/usr/bin/mkdir -p /usr/local/gib + ExecStartPre=/snap/bin/gcloud auth configure-docker --quiet us-docker.pkg.dev + ExecStart=/usr/bin/docker run --rm --name nccl-gib-installer --volume /usr/local/gib:/var/lib/gib \ + us-docker.pkg.dev/gce-ai-infra/gpudirect-gib/nccl-plugin-gib:%i install --install-nccl + + [Install] + WantedBy=slurmd.service + notify: + - Reload SystemD + handlers: + - name: Reload SystemD + ansible.builtin.systemd: + daemon_reload: true + post_tasks: + - name: Enable NCCL plugin SystemD unit + ansible.builtin.service: + name: nccl-plugin@$(vars.nccl_plugin_version).service + state: started + enabled: true + + - id: a3_ultra_nodeset + source: community/modules/compute/schedmd-slurm-gcp-v6-nodeset + use: [a3ultra-slurm-net-0, a3ultra_startup] + settings: + bandwidth_tier: gvnic_enabled + machine_type: a3-ultragpu-8g + instance_image_custom: true + enable_public_ips: true + node_count_static: $(vars.a3u_cluster_size) + node_count_dynamic_max: 0 + enable_placement: false + disk_type: 
hyperdisk-balanced + on_host_maintenance: TERMINATE + reservation_name: $(vars.a3u_reservation_name) + additional_networks: + $(concat( + [{ + network=null, + subnetwork=a3ultra-slurm-net-1.subnetwork_self_link, + subnetwork_project=vars.project_id, + nic_type="GVNIC", + queue_count=null, + network_ip="", + stack_type=null, + access_config=[], + ipv6_access_config=[], + alias_ip_range=[] + }], + a3ultra-slurm-rdma-net.subnetwork_interfaces + )) + + - id: a3_ultra_partition + source: community/modules/compute/schedmd-slurm-gcp-v6-partition + use: + - a3_ultra_nodeset + settings: + exclusive: false + partition_name: a3ultra + is_default: true + partition_conf: + ResumeTimeout: 900 + SuspendTimeout: 600 + + - id: slurm_login + source: community/modules/scheduler/schedmd-slurm-gcp-v6-login + use: [a3ultra-slurm-net-0] + settings: + instance_image_custom: true + disk_size_gb: 300 + enable_login_public_ips: true + machine_type: n2-standard-8 + + - id: controller_startup + source: modules/scripts/startup-script + settings: + runners: + - type: shell + destination: stage_scripts.sh + content: | + #!/bin/bash + SLURM_ROOT=/opt/apps/adm/slurm + PARTITION_NAME=$(a3_ultra_partition.partitions[0].partition_name) + mkdir -m 0755 -p "${SLURM_ROOT}/scripts" + mkdir -p "${SLURM_ROOT}/partition-${PARTITION_NAME}-epilog_slurmd.d" + ln -s "/slurm/scripts/tools/gpu-test" "${SLURM_ROOT}/partition-${PARTITION_NAME}-epilog_slurmd.d/gpu-test.epilog_slurmd" + + - id: slurm_controller + source: community/modules/scheduler/schedmd-slurm-gcp-v6-controller + use: + - a3ultra-slurm-net-0 + - a3_ultra_partition + - slurm_login + - homefs + settings: + enable_controller_public_ips: true + instance_image_custom: true + disk_type: pd-extreme + disk_size_gb: 300 + machine_type: n2-standard-80 + controller_startup_script: $(controller_startup.startup_script) + enable_external_prolog_epilog: true diff --git a/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-deployment.yaml b/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-deployment.yaml new file mode 100644 index 0000000000..6fa29af09e --- /dev/null +++ b/examples/machine-learning/a3-ultragpu-8g/a3ultra-slurm-deployment.yaml @@ -0,0 +1,26 @@ +# Copyright 2024 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. 
+--- +terraform_backend_defaults: + type: gcs + configuration: + bucket: # supply existing bucket to store Terraform state + +vars: + deployment_name: # supply unique deployment name + project_id: # supply existing project id + region: # supply region with a3-ultragpu-8g capacity in reservation + zone: # supply zone with a3-ultragpu-8g capacity in reservation + a3u_reservation_name: # supply a3-ultragpu-8g reservation name + a3u_cluster_size: # supply a3-ultragpu-8g reservation size diff --git a/examples/machine-learning/a3-ultragpu-8g/a3ultra-vm.yaml b/examples/machine-learning/a3-ultragpu-8g/a3ultra-vm.yaml new file mode 100644 index 0000000000..25d7fd83bf --- /dev/null +++ b/examples/machine-learning/a3-ultragpu-8g/a3ultra-vm.yaml @@ -0,0 +1,151 @@ +# Copyright 2024 Google LLC +# +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. + +--- + +blueprint_name: a3ultra-vm-instance + +vars: + project_id: # supply project ID + deployment_name: a3ultra-vm-instance + region: europe-west1 + zone: europe-west1-b + instance_image: + project: ubuntu-os-accelerator-images + family: ubuntu-accelerator-2204-amd64-with-nvidia-550 + net0_range: 192.168.0.0/19 + net1_range: 192.168.64.0/18 + filestore_ip_range: 192.168.32.0/24 + rdma_net_range: 192.168.128.0/18 + hostname_prefix: $(vars.deployment_name)-beowulf + +deployment_groups: +- group: primary + modules: + + - id: a3ultra-net-0 + source: modules/network/vpc + settings: + network_name: $(vars.deployment_name)-net-0 + mtu: 8896 + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-0 + subnet_region: $(vars.region) + subnet_ip: $(vars.net0_range) + firewall_rules: + - name: $(vars.deployment_name)-internal-0 + ranges: [$(vars.net0_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: a3ultra-net-1 + source: modules/network/vpc + settings: + network_name: $(vars.deployment_name)-net-1 + mtu: 8896 + subnetworks: + - subnet_name: $(vars.deployment_name)-sub-1 + subnet_region: $(vars.region) + subnet_ip: $(vars.net1_range) + firewall_rules: + - name: $(vars.deployment_name)-internal-1 + ranges: [$(vars.net1_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: a3ultra-rdma-net + source: modules/network/gpu-rdma-vpc + settings: + network_name: $(vars.deployment_name)-rdma-net + network_profile: https://www.googleapis.com/compute/beta/projects/$(vars.project_id)/global/networkProfiles/$(vars.zone)-vpc-roce + network_routing_mode: REGIONAL + subnetworks_template: + name_prefix: $(vars.deployment_name)-mrdma-sub + count: 8 + ip_range: $(vars.rdma_net_range) + region: $(vars.region) + firewall_rules: + - name: $(vars.deployment_name)-internal-rdma + ranges: [$(vars.rdma_net_range)] + allow: + - protocol: tcp + - protocol: udp + - protocol: icmp + + - id: homefs + source: modules/file-system/filestore + use: [a3ultra-net-0] + settings: + filestore_tier: HIGH_SCALE_SSD + size_gb: 10240 + local_mount: /home + reserved_ip_range: $(vars.filestore_ip_range) + outputs: + - network_storage + + - id: startup-script + 
source: modules/scripts/startup-script + settings: + configure_ssh_host_patterns: + - $(vars.hostname_prefix)-* + + - id: a3ultra-vms + source: modules/compute/vm-instance + use: [startup-script, homefs] + settings: + machine_type: a3-ultragpu-8g + instance_count: 2 + name_prefix: $(vars.hostname_prefix) + disk_type: hyperdisk-balanced + automatic_restart: true + on_host_maintenance: TERMINATE + reservation_name: # supply reservation name + network_interfaces: + $(concat( + [{ + network=null, + subnetwork=a3ultra-net-0.subnetwork_self_link, + subnetwork_project=vars.project_id, + nic_type="GVNIC", + queue_count=null, + network_ip=null, + stack_type=null, + access_config=[{nat_ip=null, public_ptr_domain_name=null, network_tier=null}], + ipv6_access_config=[], + alias_ip_range=[] + }, + { + network=null, + subnetwork=a3ultra-net-1.subnetwork_self_link, + subnetwork_project=vars.project_id, + nic_type="GVNIC", + queue_count=null, + network_ip=null, + stack_type=null, + access_config=[{nat_ip=null, public_ptr_domain_name=null, network_tier=null}], + ipv6_access_config=[], + alias_ip_range=[] + }], + a3ultra-rdma-net.subnetwork_interfaces, + )) + + - id: wait-for-vms + source: community/modules/scripts/wait-for-startup + settings: + instance_names: $(a3ultra-vms.name) + timeout: 7200 diff --git a/examples/machine-learning/a3-ultragpu-8g/nccl-tests/README.md b/examples/machine-learning/a3-ultragpu-8g/nccl-tests/README.md new file mode 100644 index 0000000000..3f6dfab5c9 --- /dev/null +++ b/examples/machine-learning/a3-ultragpu-8g/nccl-tests/README.md @@ -0,0 +1,89 @@ +The examples in this directory are used to show how enroot + pyxis can be used +to launch containerized workloads via Slurm. + +Contents: + +* `build-nccl-tests.sh`: A Slurm batch script for building the nccl-tests. +* `run-nccl-tests.sh`: A Slurm batch script for running the nccl-tests + `all_reduce_perf` benchmark. +* `import_container.sh`: Uses enroot to create a squashfs container image. Added + for reference only. enroot import happens within the `build-nccl-tests.sh`. + +# Running NCCL-Tests via Enroot/Pyxis + +In general the workflow to deploy GPUDirect-RDMA-enabled workloads via enroot-pyxis is +the following: + +1. Convert your container into a squashfs based container image +2. Set required environment variables +3. Run your application workload + +## TLDR + +For an end-to-end example, copy the `build-nccl-tests.sh` and +`run-nccl-tests.sh` to your login node. 
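+
+One way to copy them is with `gcloud compute scp` from your workstation (a
+sketch; the login-node name and zone below are placeholders for your own):
+
+```bash
+gcloud compute scp build-nccl-tests.sh run-nccl-tests.sh \
+  <login-node-name>:~/ --zone=<zone>
+```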
+
+And run the following:
+
+```text
+BUILD_JOB=$(sbatch --parsable build-nccl-tests.sh)   # takes ~4 minutes
+sbatch -d afterok:${BUILD_JOB} run-nccl-tests.sh     # takes ~3 minutes
+```
+
+The latter should result in a slurm-XX.out file that contains the results of
+the NCCL `all_gather_perf` benchmark:
+
+```text
+#
+#                                                             out-of-place                       in-place
+#       size         count      type   redop    root     time   algbw   busbw  #wrong     time   algbw   busbw  #wrong
+#        (B)    (elements)                               (us)  (GB/s)  (GB/s)             (us)  (GB/s)  (GB/s)
+   268435456       4194304     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+   536870912       8388608     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+  1073741824      16777216     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+  2147483648      33554432     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+  4294967296      67108864     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+  8589934592     134217728     float    none      -1    XXXXX  XXX.XX  XXX.XX     N/A   XXXXXX  XXX.XX  XXX.XX       0
+# Out of bounds values : 0 OK
+# Avg bus bandwidth    : XXX.XX
+#
+```
+
+For more details, follow the remainder of this README.
+
+## Detailed Instructions
+
+All of the following should be done on the login node of your Slurm cluster,
+from somewhere on the shared Filestore filesystem (typically the user's
+home directory).
+
+### Building NCCL-tests
+
+See `build-nccl-tests.sh` for an example. Within it, you will see that we first
+create a squashfs version of the container we want to launch, using
+`enroot import`. We do this because otherwise we would be pulling the
+(typically more than 10GB) image from the source multiple times (once on each
+node), converting it to sqsh each time, and so on, which would make the job
+launch take longer.
+
+For building the nccl-tests binaries, we use `pyxis` to run the enroot container
+and build the nccl-tests within that container, to ensure the resulting binaries
+are compatible with the container environment.
+
+Both of the above (importing and building) are accomplished by running:
+
+```text
+sbatch build-nccl-tests.sh
+```
+
+### Running your application on a3-ultra instances
+
+For a complete example, run:
+
+```text
+sbatch run-nccl-tests.sh
+```
+
+The output will appear in a `slurm-<job id>.log` file. If the name of your
+a3-ultragpu partition is different from "a3ultra", you will need to modify the
+`build-nccl-tests.sh` and `run-nccl-tests.sh` scripts' `#SBATCH --partition`
+setting. Alternatively, you can run `sbatch -p