Implement aggressive IP polling mode #66

Merged
merged 1 commit into from
Apr 24, 2019
29 changes: 24 additions & 5 deletions README.md
@@ -107,26 +107,26 @@ suites:

### Required parameters:

The following parameters should be set in the main `driver_config` section as they are common to all platforms:
The following parameters should be set in the main `driver` section as they are common to all platforms:

- `vcenter_username` - Name to use when connecting to the vSphere environment
- `vcenter_password` - Password associated with the specified user
- `vcenter_host` - Host against which logins should be attempted

The following parameters should be set in the `driver_config` for the individual platform:
The following parameters should be set in the `driver` section for the individual platform:

- `datacenter` - Name of the datacenter to use to deploy into
- `template` - Template or virtual machine to use when cloning the new machine (needs to be a VM for linked clones)

### Optional Parameters

The following parameters should be set in the main `driver_config` section as they are common to all platforms:
The following parameters should be set in the main `driver` section as they are common to all platforms:
- `vcenter_disable_ssl_verify` - Whether or not to disable SSL verification checks. Good when using self signed certificates. Default: false
- `vm_wait_timeout` - Number of seconds to wait for VM connectivity. Default: 90
- `vm_wait_interval` - Check interval between tries on VM connectivity. Default: 2.0
- `vm_rollback` - Automatic roll back (destroy) of VMs failing the connectivity check. Default: false

The following optional parameters should be used in the `driver_config` for the platform.
The following optional parameters should be set in the `driver` section for the individual platform:

- `resource_pool` - Name of the resource pool to use when creating the machine. Default: first pool
- `cluster` - Cluster on which the new virtual machine should be created. Default: cluster of the `targethost` machine.
@@ -137,8 +137,13 @@ The following optional parameters should be used in the `driver_config` for the
- `clone_type` - Type of clone, use "full" to create complete copies of template. Values: "full", "linked", "instant". Default: "full"
- `network_name` - Network to reconfigure the first interface to, needs a VM Network name. Default: do not change
- `tags` - Array of pre-defined vCenter tag names to assign (VMware tags are not key/value pairs). Default: none
- `customize` - Dictionary of `xsd:*`-type customizations like annotation, memoryMB or numCPUs (see [VirtualMachineConfigSpec](https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html)). Default: none
- `customize` - Dictionary of `xsd:*`-type customizations like annotation, memoryMB or numCPUs (see
[VirtualMachineConfigSpec](https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html)). Default: none
- `interface` - VM Network name to use for kitchen connections. Default: not set (first interface with a usable IP)
- `aggressive` - Use aggressive IP retrieval to speed up provisioning. Default: false
- `aggressive_os` - OS family of the VM. Values: "linux", "windows". Default: autodetect from VMware
- `aggressive_username` - Username to access the VM. Default: "vagrant"
- `aggressive_password` - Password to access the VM. Default: "vagrant"

## Clone types

@@ -176,6 +181,20 @@ duplicate the source MAC address, but get a different one.

For an architectural description, see <https://www.virtuallyghetto.com/2018/04/new-instant-clone-architecture-in-vsphere-6-7-part-1.html>

## Aggressive mode

This mode is used to speed up provisioning of kitchen machines as much as possible. One of the limiting factors besides the actual provisioning time
(which can be improved using the linked/instant clone modes) is waiting for the VM to return its IP address. While VMware Tools are usually available and
responding within 10-20 seconds, sending IP/OS information back to vCenter can easily take an additional 30-40 seconds.

Aggressive mode invokes OS-specific commands for IP retrieval as soon as VMware Tools are responding, by using the Guest Operations Manager
feature within the tools agent. Depending on the OS, a command to determine the IP is executed using Bash (Linux) or CMD (Windows) and the
resulting output is parsed.
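
As a rough illustration of the command selection and output parsing described above, here is a minimal sketch. The two commands are the ones used in this change; the helper names `discovery_command` and `extract_ip` are hypothetical and are not part of the driver.

```ruby
# Hypothetical helpers for illustration only; not the driver's API.
IPV4_PATTERN = /(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})/m

# Pick the guest command by OS family (commands taken from this change).
def discovery_command(os)
  case os.to_s.downcase.to_sym
  when :linux   then "ip addr | grep global | cut -b10- | cut -d/ -f1"
  when :windows then "wmic nicconfig get IPAddress"
  else raise ArgumentError, "Unsupported OS family: #{os.inspect}"
  end
end

# Keep only printable ASCII bytes if the output is not plain ASCII,
# then return the first IPv4-looking token, or nil if there is none.
def extract_ip(stdout)
  cleaned = if stdout.ascii_only?
              stdout
            else
              stdout.bytes.select { |b| (32..126).cover?(b) }.map(&:chr).join
            end
  match = cleaned.match(IPV4_PATTERN)
  match && match[1]
end
```

The pattern only checks the dotted shape of the address, not the range of each octet, so it is a heuristic in the same spirit as the change rather than a validator.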

If retrieving the IP fails for some reason, the data provided by VMware Tools is used as a fallback.
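
The fallback can be pictured as "try the fast path, and use the slow path if it yields nothing". The sketch below is a simplification under assumptions: `discover_ip` and its two callables are hypothetical names, and it swallows every error from the fast path, whereas the change itself raises on a failed guest login instead of falling back.

```ruby
# Hypothetical sketch of the fallback pattern; names are illustrative.
# `aggressive` and `standard` are callables returning an IP string or nil.
def discover_ip(aggressive:, standard:)
  ip = begin
    aggressive.call
  rescue StandardError
    nil # any failure of the fast path is treated as "no result"
  end
  # Only run the slower, tools-based lookup when the fast path gave nothing.
  ip || standard.call
end
```

For example, `discover_ip(aggressive: -> { raise "boom" }, standard: -> { "10.0.0.9" })` falls through to the standard lookup.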

Aggressive mode can speed up tests and pipelines by up to 30 seconds, but may fail due to asynchronous OS interaction in some instances.

## Contributing

For information on contributing to this project see <https://github.com/chef/chef/blob/master/CONTRIBUTING.md>
8 changes: 8 additions & 0 deletions lib/kitchen/driver/vcenter.rb
@@ -53,6 +53,10 @@ class Vcenter < Kitchen::Driver::Base
default_config :vm_rollback, false
default_config :customize, nil
default_config :interface, nil
default_config :aggressive, false
default_config :aggressive_os, nil
default_config :aggressive_username, "vagrant"
default_config :aggressive_password, "vagrant"

# The main create method
#
@@ -123,6 +127,10 @@ def create(state)
wait_timeout: config[:vm_wait_timeout],
wait_interval: config[:vm_wait_interval],
customize: config[:customize],
aggressive: config[:aggressive],
aggressive_os: config[:aggressive_os],
aggressive_username: config[:aggressive_username],
aggressive_password: config[:aggressive_password],
}

begin
195 changes: 143 additions & 52 deletions lib/support/clone_vm.rb
@@ -1,77 +1,177 @@
require "kitchen"
require "rbvmomi"
require "support/guest_operations"

class Support
class CloneError < RuntimeError; end

class CloneVm
attr_reader :vim, :options, :vm, :name, :path, :ip
attr_reader :vim, :options, :ssl_verify, :vm, :name, :ip

def initialize(conn_opts, options)
@options = options
@name = options[:name]
@ssl_verify = !conn_opts[:insecure]

# Connect to vSphere
@vim ||= RbVmomi::VIM.connect conn_opts
end

def get_ip(vm)
@ip = nil
def aggressive_discovery?
options[:aggressive] == true
end

def ip_from_tools
return if vm.guest.net.empty?

# Don't simply use vm.guest.ipAddress to allow specifying a different interface
unless vm.guest.net.empty? || !vm.guest.ipAddress
nics = vm.guest.net
if options[:interface]
nics.select! { |nic| nic.network == options[:interface] }
nics = vm.guest.net
if options[:interface]
nics.select! { |nic| nic.network == options[:interface] }

raise format("No interfaces found on VM which are attached to network '%s'", options[:interface]) if nics.empty?
end
raise Support::CloneError.new(format("No interfaces found on VM which are attached to network '%s'", options[:interface])) if nics.empty?
end

vm_ip = nil
nics.each do |net|
vm_ip = net.ipConfig.ipAddress.detect { |addr| addr.origin != "linklayer" }
break unless vm_ip.nil?
end
vm_ip = nil
nics.each do |net|
vm_ip = net.ipConfig.ipAddress.detect { |addr| addr.origin != "linklayer" }
break unless vm_ip.nil?
end

extended_msg = options[:interface] ? "Network #{options[:interface]}" : ""
raise format("No valid IP found on VM %s", extended_msg) if vm_ip.nil?
extended_msg = options[:interface] ? "Network #{options[:interface]}" : ""
raise Support::CloneError.new(format("No valid IP found on VM %s", extended_msg)) if vm_ip.nil?
vm_ip.ipAddress
end

def wait_for_tools(timeout = 30.0, interval = 2.0)
start = Time.new

@ip = vm_ip.ipAddress
loop do
if vm.guest.toolsRunningStatus == "guestToolsRunning"
Kitchen.logger.debug format("Tools detected after %d seconds", Time.new - start)
return
end
break if (Time.new - start) >= timeout
sleep interval
end

ip
raise Support::CloneError.new("Timeout waiting for VMware Tools")
end

def wait_for_ip(vm, timeout = 30.0, interval = 2.0)
def wait_for_ip(timeout = 60.0, interval = 2.0)
start = Time.new

ip = nil
loop do
ip = get_ip(vm)
break if ip || (Time.new - start) >= timeout
ip = ip_from_tools
if ip || (Time.new - start) >= timeout
Kitchen.logger.debug format("IP retrieved after %d seconds", Time.new - start) if ip
break
end
sleep interval
end

raise "Timeout waiting for IP address or no VMware Tools installed on guest" if ip.nil?
raise format("Error getting accessible IP address, got %s. Check DHCP server and scope exhaustion", ip) if ip =~ /^169\.254\./
raise Support::CloneError.new("Timeout waiting for IP address") if ip.nil?
raise Support::CloneError.new(format("Error getting accessible IP address, got %s. Check DHCP server and scope exhaustion", ip)) if ip =~ /^169\.254\./

@ip = ip
end

def detect_os
vm.config&.guestId =~ /^win/ ? :windows : :linux
end

def standard_ip_discovery
Kitchen.logger.info format("Waiting for IP (timeout: %d seconds)...", options[:wait_timeout])
wait_for_ip(options[:wait_timeout], options[:wait_interval])
end

def aggressive_ip_discovery
return unless aggressive_discovery? && !instant_clone?

# Take guest OS from VM/Template configuration, if not explicitly configured
# @see https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.apiref.doc%2Fvim.vm.GuestOsDescriptor.GuestOsIdentifier.html
if options[:aggressive_os].nil?
os = detect_os
Kitchen.logger.warn format('OS for aggressive mode not configured, got "%s" from VMware', os.to_s.capitalize)
options[:aggressive_os] = os
end

case options[:aggressive_os].downcase.to_sym
when :linux
discovery_command = "ip addr | grep global | cut -b10- | cut -d/ -f1"
when :windows
# discovery_command = "(Test-Connection -Computername $env:COMPUTERNAME -Count 1).IPV4Address.IPAddressToString"
discovery_command = "wmic nicconfig get IPAddress"
end

username = options[:aggressive_username]
password = options[:aggressive_password]
guest_auth = RbVmomi::VIM::NamePasswordAuthentication(interactiveSession: false, username: username, password: password)

Kitchen.logger.info "Attempting aggressive IP discovery"
begin
tools = Support::GuestOperations.new(vim, vm, guest_auth, ssl_verify)
stdout = tools.run_shell_capture_output(discovery_command)

# Windows returns wrongly encoded UTF-8 for some reason
stdout = stdout.bytes.map { |b| (32..126).cover?(b.ord) ? b.chr : nil }.join unless stdout.ascii_only?
@ip = stdout.match(/([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})/m).captures.first
rescue RbVmomi::Fault => e
# Keep the original fault message unless this is a failed guest login
message = e.message
if e.fault.class.wsdl_name == "InvalidGuestLogin"
message = format('Error authenticating to guest OS as "%s", check configuration of "aggressive_username"/"aggressive_password"', username)
end

raise Support::CloneError.new(message)
rescue ::StandardError
Kitchen.logger.info "Aggressive discovery failed. Trying standard discovery method."
return false
end

true
end

def reconfigure_guest
Kitchen.logger.info "Waiting for reconfiguration to finish"

# Pass contents of the customization option/Hash through to allow full customization
# https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html
config_spec = RbVmomi::VIM.VirtualMachineConfigSpec(options[:customize])

task = vm.ReconfigVM_Task(spec: config_spec)
task.wait_for_completion
end

def instant_clone?
options[:clone_type] == :instant
end

def linked_clone?
options[:clone_type] == :linked
end

def full_clone?
options[:clone_type] == :full
end

def clone
# set the datacenter name
dc = vim.serviceInstance.find_datacenter(options[:datacenter])

# reference template using full inventory path
root_folder = @vim.serviceInstance.content.rootFolder
root_folder = vim.serviceInstance.content.rootFolder
inventory_path = format("/%s/vm/%s", options[:datacenter], options[:template])
src_vm = root_folder.findByInventoryPath(inventory_path)
raise format("Unable to find template: %s", options[:template]) if src_vm.nil?
raise Support::CloneError.new(format("Unable to find template: %s", options[:template])) if src_vm.nil?

# Specify where the machine is going to be created
relocate_spec = RbVmomi::VIM.VirtualMachineRelocateSpec

# Setting the host is not allowed for instant clone due to VM memory sharing
relocate_spec.host = options[:targethost].host unless options[:clone_type] == :instant
relocate_spec.host = options[:targethost].host unless instant_clone?

# Change to delta disks for linked clones
relocate_spec.diskMoveType = :moveChildMostDiskBacking if options[:clone_type] == :linked
relocate_spec.diskMoveType = :moveChildMostDiskBacking if linked_clone?

# Set the resource pool
relocate_spec.pool = options[:resource_pool]
@@ -86,7 +186,7 @@ def clone
network_device = all_network_devices.first

networks = dc.network.select { |n| n.name == options[:network_name] }
raise format("Could not find network named %s", option[:network_name]) if networks.empty?
raise Support::CloneError.new(format("Could not find network named %s", options[:network_name])) if networks.empty?

Kitchen.logger.warn format("Found %d networks named %s, picking first one", networks.count, options[:network_name]) if networks.count > 1
network_obj = networks.first
@@ -110,7 +210,7 @@ def clone
deviceName: options[:network_name]
)
else
raise format("Unknown network type %s for network name %s", network_obj.class.to_s, options[:network_name])
raise Support::CloneError.new(format("Unknown network type %s for network name %s", network_obj.class.to_s, options[:network_name]))
end

relocate_spec.deviceChange = [
@@ -125,33 +225,31 @@ def clone
dest_folder = options[:folder].nil? ? dc.vmFolder : options[:folder][:id]

Kitchen.logger.info format("Cloning '%s' to create the VM...", options[:template])
if options[:clone_type] == :instant
if instant_clone?
vcenter_data = vim.serviceInstance.content.about
raise "Instant clones only supported with vCenter 6.7 or higher" unless vcenter_data.version.to_f >= 6.7
raise Support::CloneError.new("Instant clones only supported with vCenter 6.7 or higher") unless vcenter_data.version.to_f >= 6.7
Kitchen.logger.debug format("Detected %s", vcenter_data.fullName)

resources = dc.hostFolder.children
hosts = resources.select { |resource| resource.class.to_s =~ /ComputeResource$/ }.map { |c| c.host }.flatten
targethost = hosts.select { |host| host.summary.config.name == options[:targethost].name }.first
raise "No matching ComputeResource found in host folder" if targethost.nil?
raise Support::CloneError.new("No matching ComputeResource found in host folder") if targethost.nil?

esx_data = targethost.summary.config.product
raise "Instant clones only supported with ESX 6.7 or higher" unless esx_data.version.to_f >= 6.7
raise Support::CloneError.new("Instant clones only supported with ESX 6.7 or higher") unless esx_data.version.to_f >= 6.7
Kitchen.logger.debug format("Detected %s", esx_data.fullName)

# Other tools check for VMWare Tools status, but that will be toolsNotRunning on frozen VMs
raise "Need a running VM for instant clones" unless src_vm.runtime.powerState == "poweredOn"
raise Support::CloneError.new("Need a running VM for instant clones") unless src_vm.runtime.powerState == "poweredOn"

# In first iterations, only support the Frozen Source VM workflow. This is more efficient
# but needs preparations (freezing the source VM). Running Source VM support is to be
# added later
raise "Need a frozen VM for instant clones, running source VM not supported yet" unless src_vm.runtime.instantCloneFrozen
raise Support::CloneError.new("Need a frozen VM for instant clones, running source VM not supported yet") unless src_vm.runtime.instantCloneFrozen

# Swapping NICs not needed anymore (blog posts mention this), instant clones get a new
# MAC at least with 6.7.0 build 9433931

# @todo not working yet
# relocate_spec.folder = dest_folder
clone_spec = RbVmomi::VIM.VirtualMachineInstantCloneSpec(location: relocate_spec,
name: name)

@@ -167,29 +265,22 @@ def clone

# get the IP address of the machine for bootstrapping
# machine name is based on the path, e.g. that includes the folder
@path = options[:folder].nil? ? name : format("%s/%s", options[:folder][:name], name)
path = options[:folder].nil? ? name : format("%s/%s", options[:folder][:name], name)
@vm = dc.find_vm(path)
raise Support::CloneError.new(format("Unable to find machine: %s", path)) if vm.nil?

raise format("Unable to find machine: %s", path) if vm.nil?

# Pass contents of the customization option/Hash through to allow full customization
# https://pubs.vmware.com/vsphere-6-5/index.jsp?topic=%2Fcom.vmware.wssdk.smssdk.doc%2Fvim.vm.ConfigSpec.html
unless options[:customize].nil?
Kitchen.logger.info "Waiting for reconfiguration to finish"

config_spec = RbVmomi::VIM.VirtualMachineConfigSpec(options[:customize])
task = vm.ReconfigVM_Task(spec: config_spec)
task.wait_for_completion
end
reconfigure_guest unless options[:customize].nil?

if options[:poweron] && !options[:customize].nil? && options[:clone_type] != :instant
if options[:poweron] && !options[:customize].nil? && !instant_clone?
task = vm.PowerOnVM_Task
task.wait_for_completion
end

Kitchen.logger.info format("Waiting for VMware tools/network interfaces to become available (timeout: %d seconds)...", options[:wait_timeout])
Kitchen.logger.info format("Waiting for VMware tools to become available (timeout: %d seconds)...", options[:wait_timeout])
wait_for_tools(options[:wait_timeout], options[:wait_interval])

aggressive_ip_discovery || standard_ip_discovery

wait_for_ip(vm, options[:wait_timeout], options[:wait_interval])
Kitchen.logger.info format("Created machine %s with IP %s", name, ip)
end
end
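
Both `wait_for_tools` and `wait_for_ip` in this change follow the same poll-until-timeout shape. A minimal generic sketch of that loop is shown below. The names `poll_until` and `PollTimeout` are hypothetical, and the sketch measures elapsed time with a monotonic clock, whereas the change uses `Time.new`.

```ruby
# Hypothetical generic version of the polling loops; not part of the driver.
class PollTimeout < RuntimeError; end

# Calls the block until it returns a truthy value, sleeping `interval`
# seconds between attempts, and raises PollTimeout once `timeout` elapses.
def poll_until(timeout: 30.0, interval: 2.0)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  loop do
    result = yield
    return result if result

    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
    raise PollTimeout, format("Gave up after %.1f seconds", elapsed) if elapsed >= timeout

    sleep interval
  end
end
```

A caller could then express a tools check as `poll_until(timeout: 90, interval: 2.0) { vm.guest.toolsRunningStatus == "guestToolsRunning" }`, assuming a `vm` object like the one used in `CloneVm`.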