Fix tailscale #106

Open
wants to merge 4 commits into
base: main

Conversation

@jmacdonagh jmacdonagh commented Jan 4, 2025

Fix tailscale sysext so the service can start by default

The tailscale sysext had a number of issues as described in #105. Specifically:

  • The binaries were placed in a different directory than the vendor-supplied systemd service expected.
  • There was no EnvironmentFile, so the service could not start.
  • systemd-networkd was managing the tailscale0 interface, so Tailscale couldn't enable MagicDNS.

How to use

  1. Build the sysext with ./create_tailscale_sysext.sh 1.76.6 tailscale, scp the result to a running Flatcar machine that does not already have the tailscale sysext configured, then ssh into that machine.
  2. Confirm that systemctl status tailscaled.service reports that the unit does not exist
  3. mv /path/to/tailscale.raw /etc/extensions/tailscale.raw
  4. systemd-sysext refresh
  5. Confirm that systemctl status tailscaled.service now finds the unit
  6. systemctl start tailscaled.service
  7. Confirm that networkctl list shows tailscale0 as unmanaged (note: the state will show as degraded until you run tailscale up, which isn't needed for this test).
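
For convenience, the manual steps above could be wrapped in a small script. This is only a sketch: DRY_RUN, RAW_PATH, run and sysext_smoke_test are hypothetical names introduced here, and by default the script only prints the commands instead of running them.

```shell
#!/bin/sh
# Dry-run wrapper around the manual steps listed above.
# DRY_RUN and RAW_PATH are hypothetical knobs introduced for this sketch.
DRY_RUN="${DRY_RUN:-1}"
RAW_PATH="${RAW_PATH:-/path/to/tailscale.raw}"

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

sysext_smoke_test() {
  run systemctl status tailscaled.service   # expected to fail: unit not present yet
  run mv "$RAW_PATH" /etc/extensions/tailscale.raw
  run systemd-sysext refresh
  run systemctl status tailscaled.service   # the unit should now be found
  run systemctl start tailscaled.service
  run networkctl list                       # tailscale0 should be listed as unmanaged
}

sysext_smoke_test
```

Setting DRY_RUN=0 (as root, on the Flatcar machine) would execute the commands for real; the checks on the output are still left to the reader.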

Testing done

  • Ran the above tests
  • Ran flatcar-reset with an Ignition config that fetched tailscale.raw from an S3 bucket, then checked that the service starts, that the interface is unmanaged, etc.

Closes: #105

- Place binaries in /usr/{bin,sbin} instead of /usr/local/{bin,sbin} to match provided systemd service definition
- Add tmpfiles.d config to copy vendor supplied tailscaled.defaults to /etc/default/tailscaled so service can start
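
As an illustration of the second commit bullet, a tmpfiles.d entry of type C copies a file into place if the destination does not exist yet. The sketch below writes such an entry into a scratch directory; the file name 10-tailscale.conf and the source path in DEFAULTS_SRC are assumptions, not taken from the PR.

```shell
#!/bin/sh
# STAGE stands in for the sysext build directory ("${SYSEXTNAME}" in the build script).
STAGE="$(mktemp -d)"
# Assumed location of the vendor defaults inside the image; the real path may differ.
DEFAULTS_SRC="/usr/share/tailscale/tailscaled.defaults"

mkdir -p "$STAGE/usr/lib/tmpfiles.d"
# "C" copies the Argument to Path when Path does not exist yet.
cat <<EOF >"$STAGE/usr/lib/tmpfiles.d/10-tailscale.conf"
# Type Path                    Mode User Group Age Argument
C      /etc/default/tailscaled -    -    -     -   $DEFAULTS_SRC
EOF

cat "$STAGE/usr/lib/tmpfiles.d/10-tailscale.conf"
```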
Comment on lines 44 to 49
cat <<EOF >"${SYSEXTNAME}"/usr/lib/systemd/system/tailscaled.service.d/10-networkd-reload.conf
# Reload systemd-networkd.service to pick up 50-tailscale.network

[Service]
ExecStartPre=systemctl reload systemd-networkd.service
EOF
Author

Is there a better way to achieve this? The sysext will place /usr/lib/systemd/network/50-tailscale.network but that happens after systemd-networkd.service has started. It would be better if we had some mechanism to reload this service when this sysext is loaded.
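
For readers unfamiliar with the file in question, here is a guess at what a 50-tailscale.network that leaves the interface alone might contain. The contents are an assumption (the PR's actual file is not shown in this thread); [Link] Unmanaged=yes is the systemd.network setting that tells networkd not to configure matching links. The sketch writes into a scratch directory rather than the real sysext tree.

```shell
#!/bin/sh
# STAGE stands in for the sysext build directory ("${SYSEXTNAME}" in the build script).
STAGE="$(mktemp -d)"
mkdir -p "$STAGE/usr/lib/systemd/network"

# Assumed contents: match the tailscale0 link and mark it as unmanaged.
cat <<'EOF' >"$STAGE/usr/lib/systemd/network/50-tailscale.network"
[Match]
Name=tailscale0

[Link]
Unmanaged=yes
EOF

cat "$STAGE/usr/lib/systemd/network/50-tailscale.network"
```

networkd only sees this file if it is present when networkd loads its configuration, which is why the timing of the sysext merge matters.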

Contributor

I ran some tests and this does not seem to be required. I booted an instance without this drop-in and I see the tailscale0 link unmanaged as expected.

Author

When you run networkctl list what does it show for “TYPE” for tailscale0?

Mine was showing “none”, but I expected it to show “tun” or “tunnel”. If it had, a default networkd config would have made it unmanaged.
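
To compare results without eyeballing the table, the TYPE and SETUP columns for one link can be pulled out of networkctl list output with a small helper. This is a sketch: link_field is a hypothetical name, and the sample listing is illustrative rather than captured from a real machine.

```shell
#!/bin/sh
link_field() {
  # $1 = link name, $2 = column number in `networkctl list` output
  # (1 IDX, 2 LINK, 3 TYPE, 4 OPERATIONAL, 5 SETUP). Reads the listing on stdin.
  awk -v link="$1" -v col="$2" '$2 == link { print $col }'
}

# Illustrative listing, not captured from a real machine.
sample='IDX LINK       TYPE     OPERATIONAL SETUP
  1 lo         loopback carrier     unmanaged
  2 eth0       ether    routable    configured
  3 tailscale0 none     degraded    unmanaged'

echo "type:  $(printf '%s\n' "$sample" | link_field tailscale0 3)"
echo "setup: $(printf '%s\n' "$sample" | link_field tailscale0 5)"
```

On a real machine the same helper would be fed from the command itself, e.g. networkctl list | link_field tailscale0 5.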

Author

Ok, there's some kind of race condition with my (minimal) Ignition config.

I observed the same thing you did: with a minimal Ignition config, the link comes up unmanaged.

However, if I add just a bit more config to Ignition (specifically, mounting another volume at /var/lib/docker), the link repeatedly comes up as managed.

variant: flatcar
version: 1.0.0
storage:
  filesystems:
    - device: /dev/disk/by-id/scsi-0HC_Volume_XXXXXX
      format: ext4
      wipe_filesystem: false
      label: VOLUME
  files:
    - path: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      contents:
        source: https://XXXXXX.s3.com/tailscale.raw
    - path: /etc/sysupdate.d/noop.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/noop.conf
    - path: /etc/sysupdate.tailscale.d/tailscale.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/tailscale.conf
  links:
    - path: /etc/resolv.conf
      target: /run/systemd/resolve/stub-resolv.conf
      overwrite: true
    - path: /etc/extensions/tailscale.raw
      target: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      hard: false
    - path: /etc/systemd/system/multi-user.target.wants/tailscaled.service
      target: /usr/local/lib/systemd/system/tailscaled.service
      overwrite: true
systemd:
  units:
    # Docker volume mount
    - name: var-lib-docker.mount
      enabled: true
      contents: |
        [Unit]
        Description=Mount external volume to /var/lib/docker
        Before=local-fs.target
        [Mount]
        What=/dev/disk/by-label/VOLUME
        Where=/var/lib/docker
        Type=ext4
        [Install]
        WantedBy=local-fs.target
    - name: docker.service
      dropins:
        - name: 10-wait-docker.conf
          contents: |
            [Unit]
            After=var-lib-docker.mount
            Requires=var-lib-docker.mount
    # Tailscale sysext
    - name: systemd-sysupdate.timer
      enabled: true
    - name: systemd-sysupdate.service
      dropins:
        - name: tailscale.conf
          contents: |
            [Service]
            ExecStartPre=/usr/lib/systemd/systemd-sysupdate -C tailscale update
        - name: sysext.conf
          contents: |
            [Service]
            ExecStartPost=systemctl restart systemd-sysext

There's no explicit link between the docker mount and sysext, so I'm not sure what's going on.

Author

Ok, there is a race between systemd-sysext.service and systemd-networkd.service.

If I add a sleep 3 as an ExecStartPre to systemd-sysext.service then the link will come up as managed, even with a minimal config (shown below). Change to sleep 0 or remove the drop-in and it comes up unmanaged as expected.

variant: flatcar
version: 1.0.0
storage:
  files:
    - path: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      contents:
        source: https://XXXXXX.s3.com/tailscale.raw
    - path: /etc/sysupdate.d/noop.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/noop.conf
    - path: /etc/sysupdate.tailscale.d/tailscale.conf
      contents:
        source: https://github.com/flatcar/sysext-bakery/releases/download/latest/tailscale.conf
  links:
    - path: /etc/resolv.conf
      target: /run/systemd/resolve/stub-resolv.conf
      overwrite: true
    - path: /etc/extensions/tailscale.raw
      target: /opt/extensions/tailscale/tailscale-1.76.6-x86-64.raw
      hard: false
    - path: /etc/systemd/system/multi-user.target.wants/tailscaled.service
      target: /usr/local/lib/systemd/system/tailscaled.service
      overwrite: true
systemd:
  units:
    # Make sysext wait a bit, remove this and link will be unmanaged
    - name: systemd-sysext.service
      dropins:
        - name: fake-wait.conf
          contents: |
            [Service]
            ExecStartPre=sleep 3
    # Tailscale sysext
    - name: systemd-sysupdate.timer
      enabled: true
    - name: systemd-sysupdate.service
      dropins:
        - name: tailscale.conf
          contents: |
            [Service]
            ExecStartPre=/usr/lib/systemd/systemd-sysupdate -C tailscale update
        - name: sysext.conf
          contents: |
            [Service]
            ExecStartPost=systemctl restart systemd-sysext
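
One rough way to see which side of the race a given boot landed on is to compare monotonic timestamps of the two units. The helper below only does the comparison; networkd_started_after_sysext is a hypothetical name and the numbers are made up. On a real machine the inputs could come from systemctl show, assuming the ActiveEnterTimestampMonotonic and ExecMainStartTimestampMonotonic properties are available on the installed systemd.

```shell
#!/bin/sh
networkd_started_after_sysext() {
  # $1 = monotonic time (microseconds) at which systemd-sysext.service became active
  # $2 = monotonic time (microseconds) at which the systemd-networkd main process started
  [ "$2" -gt "$1" ]
}

# On a real machine the values might be gathered like this (not run here):
#   sysext_ts="$(systemctl show -p ActiveEnterTimestampMonotonic --value systemd-sysext.service)"
#   networkd_ts="$(systemctl show -p ExecMainStartTimestampMonotonic --value systemd-networkd.service)"

# Hypothetical values for illustration only.
sysext_ts=4200000
networkd_ts=3900000

if networkd_started_after_sysext "$sysext_ts" "$networkd_ts"; then
  echo "networkd started after the sysext merge"
else
  echo "networkd started before the sysext merge finished"
fi
```

Note that a later restart of systemd-sysext (such as the ExecStartPost in the config above) would change its timestamp, so this is only meaningful right after boot.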

If we add an explicit ordering dependency, the sleep no longer exposes the race. However, I can't provide this in the sysext because (of course) the sysext is only loaded after systemd has already decided to start both services in parallel.

systemd:
  units:
    - name: systemd-networkd.service
      dropins:
        - name: 10-after-sysext.conf
          contents: |
            [Unit]
            After=systemd-sysext.service
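
Whether the drop-in above took effect can be checked by looking for systemd-sysext.service in networkd's After= list. The helper below is a sketch with a hypothetical name (has_after_dep), exercised against a made-up sample value instead of a live system.

```shell
#!/bin/sh
has_after_dep() {
  # $1 = unit expected in the list; stdin = space-separated After= value.
  tr ' ' '\n' | grep -qxF -- "$1"
}

# Made-up sample of what `systemctl show -p After --value systemd-networkd.service`
# might print once the drop-in is active.
sample_after='systemd-sysext.service systemd-udevd.service network-pre.target'

if printf '%s\n' "$sample_after" | has_after_dep systemd-sysext.service; then
  echo "ordering dependency present"
else
  echo "ordering dependency missing"
fi
```

On a real machine the input would be the output of systemctl show -p After --value systemd-networkd.service.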

So, options are:

  1. Keep systemctl reload systemd-networkd.service as an ExecStartPre in the sysext-provided tailscaled.service
  2. Add something to README.md to tell the user to add this drop-in to their Butane config 😞
  3. Change Flatcar upstream to add the above dependency by default 😬

Contributor

Thanks a lot for this investigation. On Flatcar there is a helper to run sysext-provided units: https://github.com/flatcar/init/blob/b5a6cbcfaabe605e28e075b8ac674edaf576a0eb/systemd/system/ensure-sysext.service#L15. I wonder whether we could add a restart of systemd-networkd there.

Author

@jmacdonagh jmacdonagh Jan 19, 2025

Agreed. There is no need to restart systemd-networkd; a reload works perfectly and avoids bringing eth0 down and up.

I've added a PR to do exactly this: flatcar/init#128

Full disclosure, I've moved to just having tailscale run in docker instead of using this sysext, but happy to test/fix this for this PR ;)

Contributor

Full disclosure, I've moved to just having tailscale run in docker instead of using this sysext, but happy to test/fix this for this PR ;)

Oh, good to know. One of the goals of sysext-bakery is to provide alternatives when containers are not usable (e.g. an alternative container runtime, kernel modules, etc.). There are a few exceptions of course (e.g. Kubernetes components, to benefit from auto-updates). If you have a tailscale setup running with containers, I think we could add some documentation here: https://www.flatcar.org/docs/latest/setup/customization/ and perhaps sunset this sysext image.

Author

Unfortunately, you lose some optional Tailscale functionality when running in Docker, such as built-in SSH and its file transfer features. These work, but since you're in the equivalent of a chroot, they are not as useful as when Tailscale runs natively. I don't use those features, so Docker works for me. In fact, it's an officially supported way of running Tailscale: https://tailscale.com/kb/1282/docker

All this being said, do you want another thing to maintain and update? 😄

I suppose another solution would be to build sysexts directly from Gentoo: https://packages.gentoo.org/packages/net-vpn/tailscale

But that'll quickly get tricky maintaining runtime dependencies.

Contributor

@tormath1 tormath1 left a comment

Thanks a lot, that looks good. Can I ask you to bump the released version to 1.78.1?

@jmacdonagh
Author

Thanks a lot, that looks good. Can I ask you to bump the released version to 1.78.1?

Bumped in f3d6287

Development

Successfully merging this pull request may close these issues.

tailscaled.service cannot start due to incorrect path
2 participants