Networking in openQA

For tests using the QEMU backend, the networking type is controlled by the NICTYPE variable. If unset or empty, NICTYPE defaults to user, i.e. QEMU user networking, which requires no further configuration.

For more advanced setups, or for tests that require multiple jobs to be in the same network, the TAP or VDE based modes can be used.

Other backends can be treated just like bare-metal setups. Tests can be triggered in parallel, the same as for QEMU-based ones, and synchronization primitives can be used. On the physical network, any required separation, as well as the ability of the machines to reach each other, must be ensured externally.

QEMU User Networking

With QEMU user networking each job gets its own isolated network with TCP and UDP routed to the outside. DHCP is provided by QEMU. The MAC address of the machine can be controlled with the NICMAC variable. If not set, it is 52:54:00:12:34:56.

TAP Based Network

os-autoinst can connect QEMU to TAP devices of the host system to leverage advanced network setups provided by the host by setting NICTYPE=tap.

The TAP device to use can be configured with the TAPDEV variable. If not defined, it is automatically set to "tap" + ($worker_instance - 1), i.e. worker1 uses tap0, worker2 uses tap1 and so on.

For multiple networks per job (see NETWORKS variable), the following numbering scheme is used:

worker1: tap0 tap64 tap128 ...
worker2: tap1 tap65 tap129 ...
worker3: tap2 tap66 tap130 ...
...
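
The numbering above can be reproduced with a little arithmetic. The following is a minimal sketch, assuming the pattern simply continues with a stride of 64 per additional network as the table suggests; tap_name is a hypothetical helper for illustration and not part of os-autoinst.

```shell
# tap_name WORKER_INSTANCE [NETWORK_INDEX]
# Prints the tap device name implied by the table above (assumed rule):
#   index = (worker_instance - 1) + 64 * network_index
# NETWORK_INDEX starts at 0 for the first network of a job.
tap_name() {
    instance="$1"
    network="${2:-0}"
    echo "tap$(( (instance - 1) + 64 * network ))"
}

tap_name 1      # first network of worker1
tap_name 2 1    # second network of worker2
tap_name 3 2    # third network of worker3
```

For the three calls shown this prints tap0, tap65 and tap130, matching the rows for worker1, worker2 and worker3.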

The MAC address of each virtual NIC is controlled by the NICMAC variable or automatically computed from $worker_id if not set.

In TAP mode the system administrator is expected to configure the network, required internet access, etc. on the host as described in the next section.

Multi-machine test setup

The complete multi-machine test setup can be created with the script os-autoinst-setup-multi-machine provided by "os-autoinst". The script can also be found online at https://github.com/os-autoinst/os-autoinst/blob/master/script/os-autoinst-setup-multi-machine

The configuration is applicable to openSUSE and uses Open vSwitch as the virtual switch, firewalld (or SuSEfirewall2 for older versions) for NAT and wicked as the network manager. Keep in mind that a firewall is not strictly necessary for operation. However, operation without a firewall is not covered in full detail in this documentation.

Note
Another way to set up the environment with iptables and firewalld is described on the Fedora wiki.
Note
Alternatively salt-states-openqa contains necessities to establish such a setup and configure it for all workers with the tap worker class. They also cover GRE tunnels (that are explained in the next section).

The script os-autoinst-setup-multi-machine can be run like this:

# specify the number of test VMs to run on this host
instances=30 bash -x $(which os-autoinst-setup-multi-machine)

What os-autoinst-setup-multi-machine does

Set up Open vSwitch

The script will install and configure Open vSwitch as well as a service called os-autoinst-openvswitch.service.

Note
os-autoinst-openvswitch.service is a support service that sets the VLAN number of Open vSwitch ports based on the NICVLAN variable. This separates the groups of tests from each other. The NICVLAN variable is dynamically assigned by the openQA scheduler.

The name of the bridge (default: br1) will be set in /etc/sysconfig/os-autoinst-openvswitch.

Configure virtual interfaces

The script will add the bridge device and the tap devices for every multi-machine worker instance.

Note
The bridge device will also call a script at /etc/wicked/scripts/gre_tunnel_preup.sh on PRE_UP. This script needs to be adjusted manually if you want to set up multiple multi-machine worker hosts. Refer to the GRE tunnels section below for further information.

Configure NAT with firewalld

The required firewall rules for masquerading (NAT) and zone configuration for the trusted zone will be set up. The bridge devices will be added to the zone. IP-Forwarding will be enabled.

# show the firewall configuration
firewall-cmd --list-all-zones

What is left to do after running os-autoinst-setup-multi-machine

GRE tunnels

By default all multi-machine workers have to be on a single physical machine. You can join multiple physical machines and their OVS bridges together with a GRE tunnel.

If the workers with TAP capability are spread across multiple hosts, the network must be connected. See Open vSwitch documentation for details.

Create a gre_tunnel_preup script (change the remote_ip value correspondingly on both hosts):

cat > /etc/wicked/scripts/gre_tunnel_preup.sh <<'EOF'
#!/bin/sh
action="$1"
bridge="$2"
ovs-vsctl set bridge $bridge stp_enable=true
ovs-vsctl --may-exist add-port $bridge gre1 -- set interface gre1 type=gre options:remote_ip=<IP address of other host>
EOF

Then reference it with a PRE_UP_SCRIPT="wicked:gre_tunnel_preup.sh" entry:

# /etc/sysconfig/network/ifcfg-br1
<..>
PRE_UP_SCRIPT="wicked:gre_tunnel_preup.sh"

Make sure gre_tunnel_preup.sh is executable.
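
Because the script has to exist on both hosts with a different remote_ip each, it can be convenient to generate it. The following is a sketch only: write_gre_preup is a hypothetical helper, and the addresses in the usage comment are documentation placeholders. It writes the same script content as shown above and sets the executable bit.

```shell
# write_gre_preup TARGET_PATH REMOTE_IP
# Writes the pre-up script from this section to TARGET_PATH with
# REMOTE_IP filled in, then makes it executable.
write_gre_preup() {
    target="$1"
    remote_ip="$2"
    # \$ keeps $1, $2 and $bridge literal in the generated script;
    # only $remote_ip is expanded while writing the file.
    cat > "$target" <<EOF
#!/bin/sh
action="\$1"
bridge="\$2"
ovs-vsctl set bridge \$bridge stp_enable=true
ovs-vsctl --may-exist add-port \$bridge gre1 -- set interface gre1 type=gre options:remote_ip=$remote_ip
EOF
    chmod +x "$target"
}

# Example (as root) on the first host, pointing at the second host:
# write_gre_preup /etc/wicked/scripts/gre_tunnel_preup.sh 192.0.2.2
```

On the second host the same call would be made with the address of the first host instead.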

Note
When using GRE tunnels keep in mind that virtual machines inside the OVS bridges have to use MTU=1458 for their physical interfaces (eth0, eth1). If you are using support_server/setup.pm the MTU will be set automatically to that value on support_server itself and it does MTU advertisement for DHCP clients as well.

Configure openQA workers

Allow worker instances to run multi-machine jobs:

# /etc/openqa/workers.ini
[global]
WORKER_CLASS = qemu_x86_64,tap

Note
The number of tap devices should correspond to the number of running worker instances. For example, if you have set up 3 worker instances, the same number of tap devices should be configured.
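
A quick way to check this is to compare the existing interfaces with the expected tap names. The sketch below assumes the default naming with a single network per job (worker instance N uses tap(N-1)); missing_taps is a hypothetical helper that works on a list of interface names read from standard input.

```shell
# missing_taps INSTANCES
# Reads interface names (one per line) on stdin and prints every
# tapN with N = 0 .. INSTANCES-1 that is not among them.
missing_taps() {
    instances="$1"
    present="$(cat)"
    i=0
    while [ "$i" -lt "$instances" ]; do
        # -x matches whole lines only, so tap1 does not match tap10
        printf '%s\n' "$present" | grep -qx "tap$i" || echo "tap$i"
        i=$((i + 1))
    done
}

# Usage on a worker host with 3 worker instances
# (prints nothing if tap0, tap1 and tap2 all exist):
# ls /sys/class/net | missing_taps 3
```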

Enable worker instances to be started on system boot:

systemctl enable openqa-worker@{1..3}

Verify the setup

Simply run a multi-machine (MM) test scenario. For openSUSE, you can find many relevant tests on o3, e.g. look for networking-related tests like wicked-tests. To test GRE tunnels, you may want to change the jobs' worker classes so that the different jobs are executed on different workers.

So you could call openqa-clone-job like this:

# --skip-download --skip-chained-deps: assuming assets are present
# --max-depth 0: clone the entire parallel cluster
# --export-command: only print the API call
# URL: arbitrary job in the cluster to clone
# _GROUP=0 BUILD+=test-mm-setup: avoid interfering with production jobs
openqa-clone-job \
    --skip-download --skip-chained-deps \
    --max-depth 0 \
    --export-command \
    https://openqa.opensuse.org/tests/250309 \
    _GROUP=0 BUILD+=test-mm-setup

It will print an openqa-cli call. You can modify it to change the worker classes of the jobs individually and then invoke it.

Also be sure to reboot the worker host to make sure the setup is actually persistent.

Debugging Open vSwitch Configuration

Boot sequence with wicked (version 0.6.23 and newer):

  1. openvswitch (as above)

  2. wicked - creates the bridge br1 and tap devices, adds tap devices to the bridge

  3. firewalld (or SuSEfirewall2 in older setups)

  4. os-autoinst-openvswitch - installs openflow rules, handles vlan assignment

The configuration and operation can be checked with the following commands:

ovs-vsctl show # shows the bridge br1, the tap devices are assigned to it
ovs-ofctl dump-flows br1 # shows the rules installed by os-autoinst-openvswitch in table=0
ovs-dpctl show # show basic info on all datapaths
ovs-dpctl dump-flows # displays flows in datapaths

When everything is OK and the machines are able to communicate, ovs-vsctl show should print something like the following:

Bridge "br0"
    Port "br0"
        Interface "br0"
            type: internal
    Port "tap0"
        Interface "tap0"
    Port "tap1"
        tag: 1
        Interface "tap1"
    Port "tap2"
        tag: 1
        Interface "tap2"
  ovs_version: "2.11.1"

Note
Notice that tags are assigned to tap1 and tap2. Both should have the same tag number.
Note
If the tap devices do not match the worker instances configured in workers.ini, the tag cannot be assigned and the communication will be broken.
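
To compare tags at a glance, the ovs-vsctl show output can be condensed to one line per port. This is a minimal sketch that relies only on the output layout shown above; port_tags is a hypothetical helper and prints "-" for ports without a tag.

```shell
# port_tags: reads `ovs-vsctl show` output on stdin and prints
# "<port> <tag>" for each port, with "-" if no tag is set.
port_tags() {
    awk '
        function emit() { if (port != "") print port, (tag == "" ? "-" : tag) }
        $1 == "Port" { emit(); port = $2; gsub(/"/, "", port); tag = "" }
        $1 == "tag:" { tag = $2 }
        END          { emit() }
    '
}

# Usage on the worker host:
# ovs-vsctl show | port_tags
```

For the example output above this yields "br0 -", "tap0 -", "tap1 1" and "tap2 1", which makes it easy to spot ports of the same job cluster that ended up with different tags.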

To list the rules that are effectively configured in the underlying netfilter backend (nftables or iptables), use one of the following commands, depending on which backend is used.

Note
Whether firewalld uses nftables or iptables is determined by the FirewallBackend setting in /etc/firewalld/firewalld.conf. SuSEfirewall2 always uses iptables.

nft list tables           # list all tables
nft list table firewalld  # list all rules in the specified table
iptables --list --verbose # list all rules with packet counts

Check the flow of packets over the network:

  • packets from tapX to br1 create additional rules in table=1

  • packets from br1 to tapX increase packet counts in table=1

  • empty output indicates a problem with the os-autoinst-openvswitch service

  • zero packet count or missing rules in table=1 indicate a problem with the tap devices
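
The checks in this list can be summarized by counting the rules of one table and adding up their packet counters. The sketch below assumes the usual ovs-ofctl dump-flows line layout with "table=N," and "n_packets=N" fields; table_packets is a hypothetical helper.

```shell
# table_packets TABLE
# Reads `ovs-ofctl dump-flows` output on stdin and prints the number of
# rules in the given table and the sum of their n_packets counters.
table_packets() {
    awk -v want="table=$1," '
        index($0, want) {
            rules++
            # "n_packets=" is 10 characters long; the digits follow it
            if (match($0, /n_packets=[0-9]+/))
                packets += substr($0, RSTART + 10, RLENGTH - 10)
        }
        END { printf "rules=%d packets=%d\n", rules, packets }
    '
}

# Usage on the worker host:
# ovs-ofctl dump-flows br1 | table_packets 1
```

A result of rules=0 corresponds to missing rules in table=1, and packets=0 to a zero packet count.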

As long as the SUT has access to the external network, there should be a non-zero packet count in the forward chain between br1 and the external interface.

Note
To list packet counts when nftables is used, the rules need counters (which can be added to existing rules).

VDE Based Network

Virtual Distributed Ethernet provides a software switch that runs in user space. It allows connecting several QEMU instances without affecting the system’s network configuration.

The openQA workers need a vde_switch instance running. The workers reconfigure the switch as needed by the job.

Basic, Single Machine Tests

To start with a basic configuration like QEMU user mode networking, create a machine with the following settings:

  • VDE_SOCKETDIR=/run/openqa

  • NICTYPE=vde

  • NICVLAN=0

Start the switch and user mode networking:

systemctl enable --now openqa-vde_switch
systemctl enable --now openqa-slirpvde

With this setting all jobs on the same host would be in the same network and share the same SLIRP instance.