diff --git a/README.md b/README.md index 032fbc8..76d110e 100644 --- a/README.md +++ b/README.md @@ -107,26 +107,26 @@ suites: ### Required parameters: -The following parameters should be set in the main `driver_config` section as they are common to all platforms: +The following parameters should be set in the main `driver` section as they are common to all platforms: - `vcenter_username` - Name to use when connecting to the vSphere environment - `vcenter_password` - Password associated with the specified user - `vcenter_host` - Host against which logins should be attempted -The following parameters should be set in the `driver_config` for the individual platform: +The following parameters should be set in the `driver` section for the individual platform: - `datacenter` - Name of the datacenter to use to deploy into - `template` - Template or virtual machine to use when cloning the new machine (needs to be a VM for linked clones) ### Optional Parameters -The following parameters should be set in the main `driver_config` section as they are common to all platforms: +The following parameters should be set in the main `driver` section as they are common to all platforms: - `vcenter_disable_ssl_verify` - Whether or not to disable SSL verification checks. Good when using self signed certificates. Default: false - `vm_wait_timeout` - Number of seconds to wait for VM connectivity. Default: 90 - `vm_wait_interval` - Check interval between tries on VM connectivity. Default: 2.0 - `vm_rollback` - Automatic roll back (destroy) of VMs failing the connectivity check. Default: false -The following optional parameters should be used in the `driver_config` for the platform. +The following optional parameters should be used in the `driver` for the platform. - `resource_pool` - Name of the resource pool to use when creating the machine. Default: first pool - `cluster` - Cluster on which the new virtual machine should be created. Default: cluster of the `targethost` machine. 
@@ -137,8 +137,13 @@ The following optional parameters should be used in the `driver_config` for the - `clone_type` - Type of clone, use "full" to create complete copies of template. Values: "full", "linked", "instant". Default: "full" - `network_name` - Network to reconfigure the first interface to, needs a VM Network name. Default: do not change - `tags` - Array of pre-defined vCenter tag names to assign (VMware tags are not key/value pairs). Default: none - - `customize` - Dictionary of `xsd:*`-type customizations like annotation, memoryMB or numCPUs (see [VirtualMachineConfigSpec](https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html)). Default: none + - `customize` - Dictionary of `xsd:*`-type customizations like annotation, memoryMB or numCPUs (see +[VirtualMachineConfigSpec](https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html)). Default: none - `interface`- VM Network name to use for kitchen connections. Default: not set = first interface with usable IP + - `aggressive` - Use aggressive IP retrieval to speed up provisioning. Default: false + - `aggressive_os` - OS family of the VM. Values: "linux", "windows". Default: autodetect from VMware + - `aggressive_username` - Username to access the VM. Default: "vagrant" + - `aggressive_password` - Password to access the VM. Default: "vagrant" ## Clone types @@ -176,6 +181,20 @@ duplicate the source MAC address, but get a different one. Architectural description see +## Aggressive mode + +This mode is used to speed up provisioning of kitchen machines as much as possible. One of the limiting factors besides actual provisioning time +(which can be improved using the linked/instant clone modes) is waiting for the VM to return its IP address. While VMware tools are usually available and +responding within 10-20 seconds, sending back IP/OS information to vCenter can easily take an additional 30-40 seconds. 
+ +Aggressive mode invokes OS specific commands for IP retrieval as soon as the VMware Tools are responding, by using the Guest Operations Manager +feature within the tools agent. Depending on the OS, a command to determine the IP will be executed using Bash (Linux) or CMD (Windows) and the +resulting output parsed. + +If retrieving the IP fails for some reason, the VMware Tools provided data is used as fallback. + +Aggressive mode can speed up tests and pipelines by up to 30 seconds, but may fail due to asynchronous OS interaction in some instances. + ## Contributing For information on contributing to this project see diff --git a/lib/kitchen/driver/vcenter.rb b/lib/kitchen/driver/vcenter.rb index 4e8f946..50b0cea 100644 --- a/lib/kitchen/driver/vcenter.rb +++ b/lib/kitchen/driver/vcenter.rb @@ -53,6 +53,10 @@ class Vcenter < Kitchen::Driver::Base default_config :vm_rollback, false default_config :customize, nil default_config :interface, nil + default_config :aggressive, false + default_config :aggressive_os, nil + default_config :aggressive_username, "vagrant" + default_config :aggressive_password, "vagrant" # The main create method # @@ -123,6 +127,10 @@ def create(state) wait_timeout: config[:vm_wait_timeout], wait_interval: config[:vm_wait_interval], customize: config[:customize], + aggressive: config[:aggressive], + aggressive_os: config[:aggressive_os], + aggressive_username: config[:aggressive_username], + aggressive_password: config[:aggressive_password], } begin diff --git a/lib/support/clone_vm.rb b/lib/support/clone_vm.rb index f4ddb7c..f801c3c 100644 --- a/lib/support/clone_vm.rb +++ b/lib/support/clone_vm.rb @@ -1,57 +1,157 @@ require "kitchen" require "rbvmomi" +require "support/guest_operations" class Support + class CloneError < RuntimeError; end + class CloneVm - attr_reader :vim, :options, :vm, :name, :path, :ip + attr_reader :vim, :options, :ssl_verify, :vm, :name, :ip def initialize(conn_opts, options) @options = options @name = options[:name] 
+ @ssl_verify = !conn_opts[:insecure] # Connect to vSphere @vim ||= RbVmomi::VIM.connect conn_opts end - def get_ip(vm) - @ip = nil + def aggressive_discovery? + options[:aggressive] == true + end + + def ip_from_tools + return if vm.guest.net.empty? # Don't simply use vm.guest.ipAddress to allow specifying a different interface - unless vm.guest.net.empty? || !vm.guest.ipAddress - nics = vm.guest.net - if options[:interface] - nics.select! { |nic| nic.network == options[:interface] } + nics = vm.guest.net + if options[:interface] + nics.select! { |nic| nic.network == options[:interface] } - raise format("No interfaces found on VM which are attached to network '%s'", options[:interface]) if nics.empty? - end + raise Support::CloneError.new(format("No interfaces found on VM which are attached to network '%s'", options[:interface])) if nics.empty? + end - vm_ip = nil - nics.each do |net| - vm_ip = net.ipConfig.ipAddress.detect { |addr| addr.origin != "linklayer" } - break unless vm_ip.nil? - end + vm_ip = nil + nics.each do |net| + vm_ip = net.ipConfig.ipAddress.detect { |addr| addr.origin != "linklayer" } + break unless vm_ip.nil? + end - extended_msg = options[:interface] ? "Network #{options[:interface]}" : "" - raise format("No valid IP found on VM %s", extended_msg) if vm_ip.nil? + extended_msg = options[:interface] ? "Network #{options[:interface]}" : "" + raise Support::CloneError.new(format("No valid IP found on VM %s", extended_msg)) if vm_ip.nil? 
+ vm_ip.ipAddress + end + + def wait_for_tools(timeout = 30.0, interval = 2.0) + start = Time.new - @ip = vm_ip.ipAddress + loop do + if vm.guest.toolsRunningStatus == "guestToolsRunning" + Kitchen.logger.debug format("Tools detected after %d seconds", Time.new - start) + return + end + break if (Time.new - start) >= timeout + sleep interval end - ip + raise Support::CloneError.new("Timeout waiting for VMware Tools") end - def wait_for_ip(vm, timeout = 30.0, interval = 2.0) + def wait_for_ip(timeout = 60.0, interval = 2.0) start = Time.new ip = nil loop do - ip = get_ip(vm) - break if ip || (Time.new - start) >= timeout + ip = ip_from_tools + if ip || (Time.new - start) >= timeout + Kitchen.logger.debug format("IP retrieved after %d seconds", Time.new - start) if ip + break + end sleep interval end - raise "Timeout waiting for IP address or no VMware Tools installed on guest" if ip.nil? - raise format("Error getting accessible IP address, got %s. Check DHCP server and scope exhaustion", ip) if ip =~ /^169\.254\./ + raise Support::CloneError.new("Timeout waiting for IP address") if ip.nil? + raise Support::CloneError.new(format("Error getting accessible IP address, got %s. Check DHCP server and scope exhaustion", ip)) if ip =~ /^169\.254\./ + + @ip = ip + end + + def detect_os + vm.config&.guestId =~ /^win/ ? :windows : :linux + end + + def standard_ip_discovery + Kitchen.logger.info format("Waiting for IP (timeout: %d seconds)...", options[:wait_timeout]) + wait_for_ip(options[:wait_timeout], options[:wait_interval]) + end + + def aggressive_ip_discovery + return unless aggressive_discovery? && !instant_clone? + + # Take guest OS from VM/Template configuration, if not explicitly configured + # @see https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.apiref.doc%2Fvim.vm.GuestOsDescriptor.GuestOsIdentifier.html + if options[:aggressive_os].nil? 
+ os = detect_os + Kitchen.logger.warn format('OS for aggressive mode not configured, got "%s" from VMware', os.to_s.capitalize) + options[:aggressive_os] = os + end + + case options[:aggressive_os].downcase.to_sym + when :linux + discovery_command = "ip addr | grep global | cut -b10- | cut -d/ -f1" + when :windows + # discovery_command = "(Test-Connection -Computername $env:COMPUTERNAME -Count 1).IPV4Address.IPAddressToString" + discovery_command = "wmic nicconfig get IPAddress" + end + + username = options[:aggressive_username] + password = options[:aggressive_password] + guest_auth = RbVmomi::VIM::NamePasswordAuthentication(interactiveSession: false, username: username, password: password) + + Kitchen.logger.info "Attempting aggressive IP discovery" + begin + tools = Support::GuestOperations.new(vim, vm, guest_auth, ssl_verify) + stdout = tools.run_shell_capture_output(discovery_command) + + # Windows returns wrongly encoded UTF-8 for some reason + stdout = stdout.bytes.map { |b| (32..126).cover?(b.ord) ? b.chr : nil }.join unless stdout.ascii_only? + @ip = stdout.match(/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/m).captures.first + rescue RbVmomi::Fault => e + # Fall back to the original fault message so the error is never raised with a nil message + message = e.message + if e.fault.class.wsdl_name == "InvalidGuestLogin" + message = format('Error authenticating to guest OS as "%s", check configuration of "aggressive_username"/"aggressive_password"', username) + end + + raise Support::CloneError.new(message) + rescue ::StandardError + Kitchen.logger.info "Aggressive discovery failed. Trying standard discovery method." 
+ return false + end + + true + end + + def reconfigure_guest + Kitchen.logger.info "Waiting for reconfiguration to finish" + + # Pass contents of the customization option/Hash through to allow full customization + # https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html + config_spec = RbVmomi::VIM.VirtualMachineConfigSpec(options[:customize]) + + task = vm.ReconfigVM_Task(spec: config_spec) + task.wait_for_completion + end + + def instant_clone? + options[:clone_type] == :instant + end + + def linked_clone? + options[:clone_type] == :linked + end + + def full_clone? + options[:clone_type] == :full end def clone @@ -59,19 +159,19 @@ def clone dc = vim.serviceInstance.find_datacenter(options[:datacenter]) # reference template using full inventory path - root_folder = @vim.serviceInstance.content.rootFolder + root_folder = vim.serviceInstance.content.rootFolder inventory_path = format("/%s/vm/%s", options[:datacenter], options[:template]) src_vm = root_folder.findByInventoryPath(inventory_path) - raise format("Unable to find template: %s", options[:template]) if src_vm.nil? + raise Support::CloneError.new(format("Unable to find template: %s", options[:template])) if src_vm.nil? # Specify where the machine is going to be created relocate_spec = RbVmomi::VIM.VirtualMachineRelocateSpec # Setting the host is not allowed for instant clone due to VM memory sharing - relocate_spec.host = options[:targethost].host unless options[:clone_type] == :instant + relocate_spec.host = options[:targethost].host unless instant_clone? # Change to delta disks for linked clones - relocate_spec.diskMoveType = :moveChildMostDiskBacking if options[:clone_type] == :linked + relocate_spec.diskMoveType = :moveChildMostDiskBacking if linked_clone? 
# Set the resource pool relocate_spec.pool = options[:resource_pool] @@ -86,7 +186,7 @@ def clone network_device = all_network_devices.first networks = dc.network.select { |n| n.name == options[:network_name] } - raise format("Could not find network named %s", option[:network_name]) if networks.empty? + raise Support::CloneError.new(format("Could not find network named %s", options[:network_name])) if networks.empty? Kitchen.logger.warn format("Found %d networks named %s, picking first one", networks.count, options[:network_name]) if networks.count > 1 network_obj = networks.first @@ -110,7 +210,7 @@ def clone deviceName: options[:network_name] ) else - raise format("Unknown network type %s for network name %s", network_obj.class.to_s, options[:network_name]) + raise Support::CloneError.new(format("Unknown network type %s for network name %s", network_obj.class.to_s, options[:network_name])) end relocate_spec.deviceChange = [ @@ -125,33 +225,31 @@ def clone dest_folder = options[:folder].nil? ? dc.vmFolder : options[:folder][:id] Kitchen.logger.info format("Cloning '%s' to create the VM...", options[:template]) - if options[:clone_type] == :instant + if instant_clone? vcenter_data = vim.serviceInstance.content.about - raise "Instant clones only supported with vCenter 6.7 or higher" unless vcenter_data.version.to_f >= 6.7 + raise Support::CloneError.new("Instant clones only supported with vCenter 6.7 or higher") unless vcenter_data.version.to_f >= 6.7 Kitchen.logger.debug format("Detected %s", vcenter_data.fullName) resources = dc.hostFolder.children hosts = resources.select { |resource| resource.class.to_s =~ /ComputeResource$/ }.map { |c| c.host }.flatten targethost = hosts.select { |host| host.summary.config.name == options[:targethost].name }.first - raise "No matching ComputeResource found in host folder" if targethost.nil? + raise Support::CloneError.new("No matching ComputeResource found in host folder") if targethost.nil? 
esx_data = targethost.summary.config.product - raise "Instant clones only supported with ESX 6.7 or higher" unless esx_data.version.to_f >= 6.7 + raise Support::CloneError.new("Instant clones only supported with ESX 6.7 or higher") unless esx_data.version.to_f >= 6.7 Kitchen.logger.debug format("Detected %s", esx_data.fullName) # Other tools check for VMWare Tools status, but that will be toolsNotRunning on frozen VMs - raise "Need a running VM for instant clones" unless src_vm.runtime.powerState == "poweredOn" + raise Support::CloneError.new("Need a running VM for instant clones") unless src_vm.runtime.powerState == "poweredOn" # In first iterations, only support the Frozen Source VM workflow. This is more efficient # but needs preparations (freezing the source VM). Running Source VM support is to be # added later - raise "Need a frozen VM for instant clones, running source VM not supported yet" unless src_vm.runtime.instantCloneFrozen + raise Support::CloneError.new("Need a frozen VM for instant clones, running source VM not supported yet") unless src_vm.runtime.instantCloneFrozen # Swapping NICs not needed anymore (blog posts mention this), instant clones get a new # MAC at least with 6.7.0 build 9433931 - # @todo not working yet - # relocate_spec.folder = dest_folder clone_spec = RbVmomi::VIM.VirtualMachineInstantCloneSpec(location: relocate_spec, name: name) @@ -167,29 +265,22 @@ def clone # get the IP address of the machine for bootstrapping # machine name is based on the path, e.g. that includes the folder - @path = options[:folder].nil? ? name : format("%s/%s", options[:folder][:name], name) + path = options[:folder].nil? ? name : format("%s/%s", options[:folder][:name], name) @vm = dc.find_vm(path) + raise Support::CloneError.new(format("Unable to find machine: %s", path)) if vm.nil? - raise format("Unable to find machine: %s", path) if vm.nil? 
- - # Pass contents of the customization option/Hash through to allow full customization - # https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html - unless options[:customize].nil? - Kitchen.logger.info "Waiting for reconfiguration to finish" - - config_spec = RbVmomi::VIM.VirtualMachineConfigSpec(options[:customize]) - task = vm.ReconfigVM_Task(spec: config_spec) - task.wait_for_completion - end + reconfigure_guest unless options[:customize].nil? - if options[:poweron] && !options[:customize].nil? && options[:clone_type] != :instant + if options[:poweron] && !options[:customize].nil? && !instant_clone? task = vm.PowerOnVM_Task task.wait_for_completion end - Kitchen.logger.info format("Waiting for VMware tools/network interfaces to become available (timeout: %d seconds)...", options[:wait_timeout]) + Kitchen.logger.info format("Waiting for VMware tools to become available (timeout: %d seconds)...", options[:wait_timeout]) + wait_for_tools(options[:wait_timeout], options[:wait_interval]) + + aggressive_ip_discovery || standard_ip_discovery - wait_for_ip(vm, options[:wait_timeout], options[:wait_interval]) Kitchen.logger.info format("Created machine %s with IP %s", name, ip) end end diff --git a/lib/support/guest_operations.rb b/lib/support/guest_operations.rb new file mode 100644 index 0000000..b6df020 --- /dev/null +++ b/lib/support/guest_operations.rb @@ -0,0 +1,146 @@ +require "rbvmomi" +require "net/http" + +class Support + # Encapsulate VMware Tools GOM interaction, inspired by github:dnuffer/raidopt + class GuestOperations + attr_reader :gom, :vm, :guest_auth, :ssl_verify + + def initialize(vim, vm, guest_auth, ssl_verify = true) + @gom = vim.serviceContent.guestOperationsManager + @vm = vm + @guest_auth = guest_auth + @ssl_verify = ssl_verify + end + + def os_family + return vm.guest.guestFamily == "windowsGuest" ? 
:windows : :linux if vm.guest.guestFamily + + # VMware tools are not initialized or missing, infer from Guest Id + vm.config&.guestId =~ /^win/ ? :windows : :linux + end + + def linux? + os_family == :linux + end + + def windows? + os_family == :windows + end + + def delete_dir(dir) + gom.fileManager.DeleteDirectoryInGuest(vm: vm, auth: guest_auth, directoryPath: dir, recursive: true) + end + + def process_is_running(pid) + procs = gom.processManager.ListProcessesInGuest(vm: vm, auth: guest_auth, pids: [pid]) + procs.empty? || procs.any? { |gpi| gpi.exitCode.nil? } + end + + def process_exit_code(pid) + gom.processManager.ListProcessesInGuest(vm: vm, auth: guest_auth, pids: [pid])&.first&.exitCode + end + + def wait_for_process_exit(pid, timeout = 60.0, interval = 1.0) + start = Time.new + + loop do + return unless process_is_running(pid) + break if (Time.new - start) >= timeout + sleep interval + end + + # Only reached when the loop above ran into the timeout + raise format("Timeout waiting for process %d to exit after %d seconds", pid, timeout) + end + + def run_program(path, args = "", timeout = 60.0) + Kitchen.logger.debug format("Running %s %s", path, args) + + pid = gom.processManager.StartProgramInGuest(vm: vm, auth: guest_auth, spec: RbVmomi::VIM::GuestProgramSpec.new(programPath: path, arguments: args)) + wait_for_process_exit(pid, timeout) + + exit_code = process_exit_code(pid) + raise format("Failed to run '%s %s'. Exit code: %d", path, args, exit_code) if exit_code != 0 + + exit_code + end + + def run_shell_capture_output(command, shell = :auto, timeout = 60.0) + if linux? || shell == :linux + tmp_out_fname = format("/tmp/vm_utils_run_out_%s", Random.rand) + tmp_err_fname = format("/tmp/vm_utils_run_err_%s", Random.rand) + shell = "/bin/sh" + args = format("-c '(%s) > %s 2> %s'", command.gsub("'", %q{\\\'}), tmp_out_fname, tmp_err_fname) + elsif windows? 
|| shell == :cmd + tmp_out_fname = format('C:\Windows\TEMP\vm_utils_run_out_%s', Random.rand) + tmp_err_fname = format('C:\Windows\TEMP\vm_utils_run_err_%s', Random.rand) + shell = "cmd.exe" + args = format('/c "%s > %s 2> %s"', command.gsub("\"", %q{\\\"}), tmp_out_fname, tmp_err_fname) + elsif shell == :powershell + tmp_out_fname = format('C:\Windows\TEMP\vm_utils_run_out_%s', Random.rand) + tmp_err_fname = format('C:\Windows\TEMP\vm_utils_run_err_%s', Random.rand) + shell = 'C:\Windows\System32\WindowsPowershell\v1.0\powershell.exe' + args = format('-Command "%s > %s 2> %s"', command.gsub("\"", %q{\\\"}), tmp_out_fname, tmp_err_fname) + end + + exit_code = run_program(shell, args, timeout) + proc_out = read_file(tmp_out_fname) + if exit_code != 0 + proc_err = read_file(tmp_err_fname) + raise format("Error executing command %s. Exit code: %d. StdErr %s", command, exit_code, proc_err) + end + + proc_out + end + + def write_file(remote_file, contents) + # Required privilege: VirtualMachine.GuestOperations.Modify + put_url = gom.fileManager.InitiateFileTransferToGuest( + vm: vm, + auth: guest_auth, + guestFilePath: remote_file, + fileAttributes: RbVmomi::VIM::GuestFileAttributes(), + fileSize: contents.size, + overwrite: true + ) + # Replace only the wildcard host; the port and path of the returned URL stay untouched + put_url = put_url.gsub(%r{^https://\*:}, format("https://%s:", vm._connection.host)) + uri = URI.parse(put_url) + + request = Net::HTTP::Put.new(uri.request_uri) + request["Transfer-Encoding"] = "chunked" + request["Content-Length"] = contents.size + request.body = contents + + http = Net::HTTP.new(uri.host, uri.port) + http.use_ssl = (uri.scheme == "https") + http.verify_mode = ssl_verify ? 
OpenSSL::SSL::VERIFY_PEER : OpenSSL::SSL::VERIFY_NONE + http.request(request) + end + + def read_file(remote_file) + download_file(remote_file, nil) + end + + def upload_file(local_file, remote_file) + Kitchen.logger.debug format("Copy %s to %s", local_file, remote_file) + write_file(remote_file, File.open(local_file, "rb").read) + end + + def download_file(remote_file, local_file) + info = gom.fileManager.InitiateFileTransferFromGuest(vm: vm, auth: guest_auth, guestFilePath: remote_file) + uri = URI.parse(info.url) + + request = Net::HTTP::Get.new(uri.request_uri) + http = Net::HTTP.new(uri.host, uri.port) + http.use_ssl = (uri.scheme == "https") + http.verify_mode = ssl_verify ? OpenSSL::SSL::VERIFY_PEER : OpenSSL::SSL::VERIFY_NONE + response = http.request(request) + + if response.body.size != info.size + raise format("Downloaded file has different size than reported: %s (%d bytes instead of %d bytes)", remote_file, response.body.size, info.size) + end + + local_file.nil? ? response.body : File.open(local_file, "w") { |file| file.write(response.body) } + end + end +end
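
Note on the output parsing in `aggressive_ip_discovery`: the sketch below isolates the cleanup and IPv4 extraction so it can be tried without a vCenter connection. `extract_first_ipv4` and `IPV4_PATTERN` are hypothetical names that do not appear in this diff, and the sample outputs in the usage lines are assumptions about what `ip addr` and `wmic` might print. Unlike the diff, the helper returns `nil` instead of raising when nothing matches. Like the diff's regex, the pattern does not range-check octets.

```ruby
# Hypothetical helper, not part of the diff: returns the first IPv4-looking
# token in raw guest command output, or nil if there is none.
IPV4_PATTERN = /\b(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\b/

def extract_first_ipv4(raw_output)
  # Same cleanup idea as the diff: if the output is not pure ASCII,
  # keep only printable ASCII bytes (this also drops CR/LF).
  text =
    if raw_output.ascii_only?
      raw_output
    else
      raw_output.bytes.select { |b| (32..126).cover?(b) }.map(&:chr).join
    end

  match = text.match(IPV4_PATTERN)
  match && match[1]
end

# Usage with assumed sample outputs
linux_ip   = extract_first_ipv4("10.0.0.5\n")
windows_ip = extract_first_ipv4("IPAddress\r\n{\"192.168.1.20\", \"fe80::1\"}\r\n")
```

Returning `nil` lets a caller fall back to the VMware Tools data with an explicit check, rather than relying on the broad `rescue ::StandardError` to catch a `NoMethodError` from `captures`.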