Introduce privileged-mode #9017

Open
wants to merge 3 commits into
base: master

A1kmm
@A1kmm A1kmm commented Oct 24, 2024

The privileged-mode setting lets admins decide what level of privilege tasks running as privileged should have. This gives the ability to lock down privileged access to a level that isn't equivalent to full root on the host.

There are three proposed levels:

  • full: the status quo. This has multiple vectors to take over the host, including by loading modules into the kernel.
  • fuse-only: enough to work with containers using tools like buildah and podman if they are configured appropriately. As long as the Concourse worker is run in a user namespace on an up-to-date Linux kernel, this shouldn't be enough access to escape the container.
  • ignore: privileged tasks have the same access as normal tasks.
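
The three levels could be modelled as a small validated string type. This is a minimal sketch, not the PR's actual code: the names `PrivilegedMode`, the three constants, and `ParsePrivilegedMode` are hypothetical, and treating an empty value as `full` is an assumption based on `full` being the status quo.

```go
package main

import "fmt"

// PrivilegedMode is a hypothetical type for the three proposed levels.
type PrivilegedMode string

const (
	FullPrivilegedMode     PrivilegedMode = "full"      // status quo
	FUSEOnlyPrivilegedMode PrivilegedMode = "fuse-only" // enough for buildah/podman
	IgnorePrivilegedMode   PrivilegedMode = "ignore"    // same access as normal tasks
)

// ParsePrivilegedMode validates a user-supplied value.
// Assumption: an empty value falls back to "full".
func ParsePrivilegedMode(s string) (PrivilegedMode, error) {
	switch PrivilegedMode(s) {
	case "":
		return FullPrivilegedMode, nil
	case FullPrivilegedMode, FUSEOnlyPrivilegedMode, IgnorePrivilegedMode:
		return PrivilegedMode(s), nil
	default:
		return "", fmt.Errorf("invalid privileged mode %q", s)
	}
}

func main() {
	for _, v := range []string{"", "full", "fuse-only", "ignore", "root"} {
		mode, err := ParsePrivilegedMode(v)
		fmt.Printf("%q -> %q, err=%v\n", v, mode, err)
	}
}
```

Rejecting unknown values up front means a typo in the setting fails at startup instead of silently granting a different level of access.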

To get podman and buildah working, a few more syscalls need to be allowed through seccomp. A few harmless ones have been added to the general allow list, while others related to mounting and unsharing are only added for fuse-only mode.
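
The split between a general allow list and fuse-only additions might look roughly like the sketch below. The description does not enumerate the syscalls involved, so `mount`, `umount2`, and `unshare` are assumptions (real Linux syscalls in the "mounting and unsharing" category), the `read`/`write` base list is a stand-in, and `allowedSyscalls` is a hypothetical helper. How `full` mode treats seccomp is outside this sketch.

```go
package main

import "fmt"

// fuseOnlyExtras lists syscalls "related to mounting and unsharing".
// Assumption: the exact set allowed by the PR is not stated in the description.
var fuseOnlyExtras = []string{"mount", "umount2", "unshare"}

// allowedSyscalls returns the allow list for a mode, given the general
// allow list shared by all tasks. Only fuse-only receives the extras.
func allowedSyscalls(general []string, mode string) []string {
	out := append([]string{}, general...) // copy so the caller's slice is untouched
	if mode == "fuse-only" {
		out = append(out, fuseOnlyExtras...)
	}
	return out
}

func main() {
	general := []string{"read", "write"} // stand-in for the general allow list
	fmt.Println(allowedSyscalls(general, "ignore"))
	fmt.Println(allowedSyscalls(general, "fuse-only"))
}
```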

Changes proposed by this PR

  • Implement privileged-mode
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: full, can create container with buildah and run with podman.
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: fuse-only, can create container with buildah and run with podman. Fewer capabilities are granted than in full mode.
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: fuse-only, cannot escape container using cgroup release_agent (note: escape is still possible if the worker is not run in a new userns; when it is, setting release_agent fails).
  • Manual local testing: CONCOURSE_CONTAINERD_PRIVILEGED_MODE: ignore, cannot create container with buildah and run with podman, as expected.
  • Write automated tests for functionality.
  • Convert from draft PR to normal PR.
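
Because the fuse-only result above depends on whether the worker runs in a new user namespace, it can help to check that before testing. The following is a heuristic sketch and not part of the PR: it assumes that a `/proc/self/uid_map` consisting of the single identity mapping `0 0 4294967295` indicates the initial user namespace. The function name `looksLikeInitialUserNS` is hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// looksLikeInitialUserNS reports whether the given uid_map contents are the
// single full-range identity mapping typical of the initial user namespace.
// This is a heuristic, not a guarantee.
func looksLikeInitialUserNS(uidMap string) bool {
	fields := strings.Fields(uidMap)
	return len(fields) == 3 &&
		fields[0] == "0" && fields[1] == "0" && fields[2] == "4294967295"
}

func main() {
	data, err := os.ReadFile("/proc/self/uid_map")
	if err != nil {
		fmt.Println("cannot read uid_map (not Linux?):", err)
		return
	}
	if looksLikeInitialUserNS(string(data)) {
		fmt.Println("process appears to run in the initial user namespace")
	} else {
		fmt.Println("process appears to run in a separate user namespace")
	}
}
```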

Notes to reviewer

This pipeline is helpful for manual testing:

jobs:
  - name: build-container
    public: false
    plan:
    - task: build
      privileged: true
      config:
        platform: linux
        image_resource:
          type: registry-image
          source:
            repository: quay.io/buildah/stable
        run:
          path: /bin/bash
          args:
            - "-c"
            - |
              capsh --print &&\
              yum -y install podman &&\
              mkdir container-storage &&\
              ls -l /dev/fuse /usr/bin/fuse-overlayfs $(pwd) $(pwd)/container-storage &&\
              PODMAN_ROOT=$(pwd)/container-storage &&\
              echo FROM mirror.gcr.io/alpine:latest >Dockerfile &&\
              echo CMD echo Hello World >>Dockerfile &&\
              buildah bud --root=$PODMAN_ROOT -t helloworld &&\
              echo "[containers]" >/etc/containers/containers.conf &&\
              echo "keyring = false" >>/etc/containers/containers.conf &&\
              podman run --rm --uts=host --network=host --userns=host --root=$PODMAN_ROOT --cgroups=disabled -it helloworld

Release Note

  • Added a new --privileged-mode option to the worker, which accepts full (default, previous behaviour), fuse-only (privileged: true tasks can use tools like buildah and podman, but can't escape if user namespaces are used to run the worker), or ignore (privileged: true tasks have no extra access compared to privileged: false tasks).

@A1kmm A1kmm marked this pull request as ready for review October 25, 2024 09:56
@A1kmm A1kmm requested a review from a team as a code owner October 25, 2024 09:56
@taylorsilva taylorsilva added this to the v7.13.0 milestone Dec 4, 2024
@taylorsilva
Member

Thanks for the PR! Will review it soon (hopefully)

Member

@taylorsilva taylorsilva left a comment

There's some confusing operator UX with this PR. You mention this adds a --privileged-mode flag, but it actually adds two flags: --containerd-privileged-mode and --baggageclaim-privileged-mode.

I can see that the value passed into --containerd-privileged-mode gets passed into baggageclaim. Could we consolidate to only exposing the --containerd-privileged-mode flag instead? It also means fewer flags to add to the chart and bosh deployment. This feature only makes sense for the containerd runtime, correct?

@A1kmm
Author

A1kmm commented Jan 14, 2025

> This feature only makes sense for the containerd runtime, correct?

It is only usable with containerd at the moment anyway; it would take further investigation (and possibly different options) for different backends, so limiting it to containerd makes sense.

> Could we consolidate to only exposing the --containerd-privileged-mode flag instead?

The argument is used as a way to let the baggageclaim runner know the privileged mode. Perhaps the best solution is to pass it as a Go argument to BaggageClaimCommand.Runner instead of making it a field of the command structure. That would remove the need for a separate flag.

If other backends start supporting a similar mode in the future, potentially the logic could then be extended to make baggageclaim do the right thing for all of them.
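
The idea of passing the mode as a Go argument rather than a struct field can be sketched as follows. This is illustrative only: the real `BaggageClaimCommand.Runner` has a different signature and return type, and the `VolumesDir` field, the string return value, and the example path are hypothetical stand-ins.

```go
package main

import "fmt"

// BaggageClaimCommand stands in for the real command structure. In this
// sketch it has no privileged-mode field, so no extra flag is exposed.
type BaggageClaimCommand struct {
	VolumesDir string // hypothetical existing option
}

// Runner receives the mode as an ordinary argument from its caller, for
// example the worker command that owns --containerd-privileged-mode.
func (cmd *BaggageClaimCommand) Runner(privilegedMode string) string {
	return fmt.Sprintf("baggageclaim(volumes=%s, privileged-mode=%s)",
		cmd.VolumesDir, privilegedMode)
}

func main() {
	containerdPrivilegedMode := "fuse-only" // value of the single exposed flag
	bc := &BaggageClaimCommand{VolumesDir: "/worker-state/volumes"}
	fmt.Println(bc.Runner(containerdPrivilegedMode))
}
```

Since flags are typically derived from fields of the command structure, keeping the value out of the structure keeps it out of the CLI surface while still letting baggageclaim see it.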

A1kmm added 2 commits January 15, 2025 23:07
The privileged-mode setting lets admins decide what level of privilege
tasks running as privileged should have. This gives the ability to
lock down privileged access to a level that isn't equivalent to full
root on the host.

There are three proposed levels:
- full: the status quo. This has multiple vectors to take over the host,
  including by loading modules into the kernel.
- fuse-only: enough to work with containers using tools like buildah and
  podman if they are configured appropriately. As long as the Concourse
  worker is run in a user namespace on an up-to-date Linux kernel, this
  shouldn't be enough access to escape the container.
- ignore: privileged tasks have the same access as normal tasks.

To get podman and buildah working, a few more syscalls need to be
allowed through seccomp. A few harmless ones have been added to the
general allow list, while others related to mounting and unsharing
are only added for fuse-only mode.

Signed-off-by: Andrew Miller <[email protected]>
@A1kmm
Author

A1kmm commented Jan 15, 2025

I have now updated the PR to not use the extra argument in the baggageclaim command structure, and instead just pass it using Go arguments @taylorsilva.

@A1kmm A1kmm requested a review from taylorsilva January 16, 2025 09:25
Projects
Status: In review
2 participants