Commit

SlurmGCP. Deprecate enable_smt, add advanced_machine_features instead
mr0re1 committed Jan 10, 2025
1 parent 5f7f407 commit 5d9f1c4
Showing 28 changed files with 257 additions and 138 deletions.
@@ -90,6 +90,7 @@ modules. For support with the underlying modules, see the instructions in the
| <a name="input_access_config"></a> [access\_config](#input\_access\_config) | Access configurations, i.e. IPs via which the VM instance can be accessed via the Internet. | <pre>list(object({<br/> nat_ip = string<br/> network_tier = string<br/> }))</pre> | `[]` | no |
| <a name="input_additional_disks"></a> [additional\_disks](#input\_additional\_disks) | Configurations of additional disks to be included on the partition nodes. | <pre>list(object({<br/> disk_name = string<br/> device_name = string<br/> disk_size_gb = number<br/> disk_type = string<br/> disk_labels = map(string)<br/> auto_delete = bool<br/> boot = bool<br/> }))</pre> | `[]` | no |
| <a name="input_additional_networks"></a> [additional\_networks](#input\_additional\_networks) | Additional network interface details for GCE, if any. | <pre>list(object({<br/> network = string<br/> subnetwork = string<br/> subnetwork_project = string<br/> network_ip = string<br/> nic_type = string<br/> stack_type = string<br/> queue_count = number<br/> access_config = list(object({<br/> nat_ip = string<br/> network_tier = string<br/> }))<br/> ipv6_access_config = list(object({<br/> network_tier = string<br/> }))<br/> alias_ip_range = list(object({<br/> ip_cidr_range = string<br/> subnetwork_range_name = string<br/> }))<br/> }))</pre> | `[]` | no |
+| <a name="input_advanced_machine_features"></a> [advanced\_machine\_features](#input\_advanced\_machine\_features) | See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#nested_advanced_machine_features | <pre>object({<br/> enable_nested_virtualization = optional(bool)<br/> threads_per_core = optional(number)<br/> turbo_mode = optional(string)<br/> visible_core_count = optional(number)<br/> performance_monitoring_unit = optional(string)<br/> enable_uefi_networking = optional(bool)<br/> })</pre> | <pre>{<br/> "threads_per_core": 1<br/>}</pre> | no |
| <a name="input_allow_automatic_updates"></a> [allow\_automatic\_updates](#input\_allow\_automatic\_updates) | If false, disables automatic system package updates on the created instances. This feature is<br/>only available on supported images (or images derived from them). For more details, see<br/>https://cloud.google.com/compute/docs/instances/create-hpc-vm#disable_automatic_updates | `bool` | `true` | no |
| <a name="input_bandwidth_tier"></a> [bandwidth\_tier](#input\_bandwidth\_tier) | Configures the network interface card and the maximum egress bandwidth for VMs.<br/> - Setting `platform_default` respects the Google Cloud Platform API default values for networking.<br/> - Setting `virtio_enabled` explicitly selects the VirtioNet network adapter.<br/> - Setting `gvnic_enabled` selects the gVNIC network adapter (without Tier 1 high bandwidth).<br/> - Setting `tier_1_enabled` selects both the gVNIC adapter and Tier 1 high bandwidth networking.<br/> - Note: both gVNIC and Tier 1 networking require a VM image with gVNIC support as well as specific VM families and shapes.<br/> - See [official docs](https://cloud.google.com/compute/docs/networking/configure-vm-with-high-bandwidth-configuration) for more details. | `string` | `"platform_default"` | no |
| <a name="input_can_ip_forward"></a> [can\_ip\_forward](#input\_can\_ip\_forward) | Enable IP forwarding, for NAT instances for example. | `bool` | `false` | no |
@@ -101,7 +102,7 @@ modules. For support with the underlying modules, see the instructions in the
| <a name="input_enable_oslogin"></a> [enable\_oslogin](#input\_enable\_oslogin) | Enables Google Cloud os-login for user login and authentication for VMs.<br/>See https://cloud.google.com/compute/docs/oslogin | `bool` | `true` | no |
| <a name="input_enable_public_ips"></a> [enable\_public\_ips](#input\_enable\_public\_ips) | If set to true. The node group VMs will have a random public IP assigned to it. Ignored if access\_config is set. | `bool` | `false` | no |
| <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration. Note: the instance image must support option. | `bool` | `false` | no |
-| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | Enables Simultaneous Multi-Threading (SMT) on instance. | `bool` | `false` | no |
+| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | DEPRECATED: Use `advanced_machine_features.threads_per_core` instead. | `bool` | `null` | no |
| <a name="input_enable_spot_vm"></a> [enable\_spot\_vm](#input\_enable\_spot\_vm) | Enable the partition to use spot VMs (https://cloud.google.com/spot-vms). | `bool` | `false` | no |
| <a name="input_feature"></a> [feature](#input\_feature) | The node feature, used to bind nodes to the nodeset. If not set, the nodeset name will be used. | `string` | `null` | no |
| <a name="input_guest_accelerator"></a> [guest\_accelerator](#input\_guest\_accelerator) | List of the type and count of accelerator cards attached to the instance. | <pre>list(object({<br/> type = string,<br/> count = number<br/> }))</pre> | `[]` | no |
@@ -84,11 +84,11 @@ module "slurm_nodeset_template" {
   bandwidth_tier = var.bandwidth_tier
   can_ip_forward = var.can_ip_forward

-  disable_smt              = !var.enable_smt
-  enable_confidential_vm   = var.enable_confidential_vm
-  enable_oslogin           = var.enable_oslogin
-  enable_shielded_vm       = var.enable_shielded_vm
-  shielded_instance_config = var.shielded_instance_config
+  advanced_machine_features = var.advanced_machine_features
+  enable_confidential_vm    = var.enable_confidential_vm
+  enable_oslogin            = var.enable_oslogin
+  enable_shielded_vm        = var.enable_shielded_vm
+  shielded_instance_config  = var.shielded_instance_config

   labels = local.labels
   machine_type = var.machine_type
@@ -208,10 +208,29 @@ variable "can_ip_forward" {
   default = false
 }

-variable "enable_smt" {
+variable "advanced_machine_features" {
+  description = "See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#nested_advanced_machine_features"
+  type = object({
+    enable_nested_virtualization = optional(bool)
+    threads_per_core = optional(number)
+    turbo_mode = optional(string)
+    visible_core_count = optional(number)
+    performance_monitoring_unit = optional(string)
+    enable_uefi_networking = optional(bool)
+  })
+  default = {
+    threads_per_core = 1 # disable SMT by default
+  }
+}
+
+variable "enable_smt" { # tflint-ignore: terraform_unused_declarations
   type = bool
-  description = "Enables Simultaneous Multi-Threading (SMT) on instance."
-  default = false
+  description = "DEPRECATED: Use `advanced_machine_features.threads_per_core` instead."
+  default = null
+  validation {
+    condition = var.enable_smt == null
+    error_message = "DEPRECATED: Use `advanced_machine_features.threads_per_core` instead."
+  }
 }

 variable "labels" {
@@ -163,6 +163,7 @@ modules. For support with the underlying modules, see the instructions in the
| <a name="input_access_config"></a> [access\_config](#input\_access\_config) | Access configurations, i.e. IPs via which the VM instance can be accessed via the Internet. | <pre>list(object({<br/> nat_ip = string<br/> network_tier = string<br/> }))</pre> | `[]` | no |
| <a name="input_additional_disks"></a> [additional\_disks](#input\_additional\_disks) | Configurations of additional disks to be included on the partition nodes. | <pre>list(object({<br/> disk_name = string<br/> device_name = string<br/> disk_size_gb = number<br/> disk_type = string<br/> disk_labels = map(string)<br/> auto_delete = bool<br/> boot = bool<br/> }))</pre> | `[]` | no |
| <a name="input_additional_networks"></a> [additional\_networks](#input\_additional\_networks) | Additional network interface details for GCE, if any. | <pre>list(object({<br/> network = optional(string)<br/> subnetwork = string<br/> subnetwork_project = optional(string)<br/> network_ip = optional(string, "")<br/> nic_type = optional(string)<br/> stack_type = optional(string)<br/> queue_count = optional(number)<br/> access_config = optional(list(object({<br/> nat_ip = string<br/> network_tier = string<br/> })), [])<br/> ipv6_access_config = optional(list(object({<br/> network_tier = string<br/> })), [])<br/> alias_ip_range = optional(list(object({<br/> ip_cidr_range = string<br/> subnetwork_range_name = string<br/> })), [])<br/> }))</pre> | `[]` | no |
+| <a name="input_advanced_machine_features"></a> [advanced\_machine\_features](#input\_advanced\_machine\_features) | See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#nested_advanced_machine_features | <pre>object({<br/> enable_nested_virtualization = optional(bool)<br/> threads_per_core = optional(number)<br/> turbo_mode = optional(string)<br/> visible_core_count = optional(number)<br/> performance_monitoring_unit = optional(string)<br/> enable_uefi_networking = optional(bool)<br/> })</pre> | <pre>{<br/> "threads_per_core": 1<br/>}</pre> | no |
| <a name="input_allow_automatic_updates"></a> [allow\_automatic\_updates](#input\_allow\_automatic\_updates) | If false, disables automatic system package updates on the created instances. This feature is<br/>only available on supported images (or images derived from them). For more details, see<br/>https://cloud.google.com/compute/docs/instances/create-hpc-vm#disable_automatic_updates | `bool` | `true` | no |
| <a name="input_bandwidth_tier"></a> [bandwidth\_tier](#input\_bandwidth\_tier) | Configures the network interface card and the maximum egress bandwidth for VMs.<br/> - Setting `platform_default` respects the Google Cloud Platform API default values for networking.<br/> - Setting `virtio_enabled` explicitly selects the VirtioNet network adapter.<br/> - Setting `gvnic_enabled` selects the gVNIC network adapter (without Tier 1 high bandwidth).<br/> - Setting `tier_1_enabled` selects both the gVNIC adapter and Tier 1 high bandwidth networking.<br/> - Note: both gVNIC and Tier 1 networking require a VM image with gVNIC support as well as specific VM families and shapes.<br/> - See [official docs](https://cloud.google.com/compute/docs/networking/configure-vm-with-high-bandwidth-configuration) for more details. | `string` | `"platform_default"` | no |
| <a name="input_can_ip_forward"></a> [can\_ip\_forward](#input\_can\_ip\_forward) | Enable IP forwarding, for NAT instances for example. | `bool` | `false` | no |
@@ -179,7 +180,7 @@ modules. For support with the underlying modules, see the instructions in the
| <a name="input_enable_placement"></a> [enable\_placement](#input\_enable\_placement) | Enable placement groups. | `bool` | `true` | no |
| <a name="input_enable_public_ips"></a> [enable\_public\_ips](#input\_enable\_public\_ips) | If set to true. The node group VMs will have a random public IP assigned to it. Ignored if access\_config is set. | `bool` | `false` | no |
| <a name="input_enable_shielded_vm"></a> [enable\_shielded\_vm](#input\_enable\_shielded\_vm) | Enable the Shielded VM configuration. Note: the instance image must support option. | `bool` | `false` | no |
-| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | Enables Simultaneous Multi-Threading (SMT) on instance. | `bool` | `false` | no |
+| <a name="input_enable_smt"></a> [enable\_smt](#input\_enable\_smt) | DEPRECATED: Use `advanced_machine_features.threads_per_core` instead. | `bool` | `null` | no |
| <a name="input_enable_spot_vm"></a> [enable\_spot\_vm](#input\_enable\_spot\_vm) | Enable the partition to use spot VMs (https://cloud.google.com/spot-vms). | `bool` | `false` | no |
| <a name="input_future_reservation"></a> [future\_reservation](#input\_future\_reservation) | If set, will make use of the future reservation for the nodeset. Input can be either the future reservation name or its selfLink in the format 'projects/PROJECT\_ID/zones/ZONE/futureReservations/FUTURE\_RESERVATION\_NAME'.<br/>See https://cloud.google.com/compute/docs/instances/future-reservations-overview | `string` | `""` | no |
| <a name="input_guest_accelerator"></a> [guest\_accelerator](#input\_guest\_accelerator) | List of the type and count of accelerator cards attached to the instance. | <pre>list(object({<br/> type = string,<br/> count = number<br/> }))</pre> | `[]` | no |
10 changes: 5 additions & 5 deletions community/modules/compute/schedmd-slurm-gcp-v6-nodeset/main.tf
@@ -76,7 +76,6 @@ locals {

     bandwidth_tier = var.bandwidth_tier
     can_ip_forward = var.can_ip_forward
-    disable_smt    = !var.enable_smt

     enable_confidential_vm = var.enable_confidential_vm
     enable_placement = var.enable_placement
@@ -85,10 +84,11 @@ locals {
     enable_shielded_vm = var.enable_shielded_vm
     gpu = one(local.guest_accelerator)

-    labels           = local.labels
-    machine_type     = terraform_data.machine_type_zone_validation.output
-    metadata         = local.metadata
-    min_cpu_platform = var.min_cpu_platform
+    labels                    = local.labels
+    machine_type              = terraform_data.machine_type_zone_validation.output
+    advanced_machine_features = var.advanced_machine_features
+    metadata                  = local.metadata
+    min_cpu_platform          = var.min_cpu_platform

     on_host_maintenance = var.on_host_maintenance
     preemptible = var.preemptible
@@ -227,10 +227,29 @@ variable "can_ip_forward" {
   default = false
 }

-variable "enable_smt" {
+variable "advanced_machine_features" {
+  description = "See https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_instance_template#nested_advanced_machine_features"
+  type = object({
+    enable_nested_virtualization = optional(bool)
+    threads_per_core = optional(number)
+    turbo_mode = optional(string)
+    visible_core_count = optional(number)
+    performance_monitoring_unit = optional(string)
+    enable_uefi_networking = optional(bool)
+  })
+  default = {
+    threads_per_core = 1 # disable SMT by default
+  }
+}
+
+variable "enable_smt" { # tflint-ignore: terraform_unused_declarations
   type = bool
-  description = "Enables Simultaneous Multi-Threading (SMT) on instance."
-  default = false
+  description = "DEPRECATED: Use `advanced_machine_features.threads_per_core` instead."
+  default = null
+  validation {
+    condition = var.enable_smt == null
+    error_message = "DEPRECATED: Use `advanced_machine_features.threads_per_core` instead."
+  }
 }

 variable "labels" {
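
For callers that previously set `enable_smt = true`, the change above implies a migration along the following lines. This is a minimal sketch, assuming the nodeset module is called directly from Terraform: the module label `nodeset` and the relative `source` path are hypothetical, other required inputs are omitted, and `threads_per_core = 2` assumes a machine type that exposes two threads per core.

```hcl
# Hypothetical direct call to the nodeset module; label and source path are illustrative.
module "nodeset" {
  source = "./community/modules/compute/schedmd-slurm-gcp-v6-nodeset"

  # Before this commit:
  #   enable_smt = true
  # After this commit, any non-null enable_smt value (true or false)
  # fails the variable's validation rule, so the line must be removed.

  advanced_machine_features = {
    threads_per_core = 2 # assumption: the machine type supports 2 threads per core
  }

  # ... other required inputs omitted ...
}
```

Terraform replaces a variable's default wholesale when the caller supplies a value, so an `advanced_machine_features` object that omits `threads_per_core` leaves that attribute `null` instead of inheriting the default of `1`. Callers who set other attributes and still want SMT disabled would need to repeat `threads_per_core = 1` explicitly.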