Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for Bottlerocket #47

Open
phillebaba opened this issue Mar 6, 2023 · 20 comments
Open

Support for Bottlerocket #47

phillebaba opened this issue Mar 6, 2023 · 20 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@phillebaba
Copy link
Member

Currently Spegel does not work with Bottlerocket when run with EKS.

https://github.com/bottlerocket-os/bottlerocket/blob/develop/README.md

This needs to be explored further to see if it is possible to support.

@phillebaba phillebaba added enhancement New feature or request help wanted Extra attention is needed labels Mar 6, 2023
@phillebaba
Copy link
Member Author

phillebaba commented Mar 13, 2023

After having a quick look at the Containerd configuration generated by Bottlerocket it seems like it does not currently support Spegel.

https://github.com/bottlerocket-os/bottlerocket/blob/7b67f9118151e67ac1e0972ef9d48eb3487c4ce9/packages/containerd/containerd-config-toml_k8s_containerd_sock#L47-L52

The only mirror configuration supported currently is the old syntax. There is an issue to track implementing the new syntax. It needs to be closed before Bottlerocket can be supported.
bottlerocket-os/bottlerocket/issues/1963

@mballoni
Copy link

I was about to test Spegel when I found this issue.
Is bottlerocket team aware of this kind of limitation?

Thank you

@phillebaba
Copy link
Member Author

@mballoni I don't know if there is any activity on the issue as it has been around since last year. I do not have time right now myself to contribute to the project so it depends on if anyone else prioritizes getting this done.

@empath-nirvana
Copy link

Is it possible to create a flag to disable the automatic node configuration, along with some docs for how to configure containerd on the node to make it work?

I could very easily tweak the user data on my bottlerocket nodes to have the right settings to work with spegel, I think -- the main issue is that spegel will crash because it tries to modify the settings itself and can't.

@bittrance
Copy link
Contributor

@empath-nirvana If you are talking about spegel writing containerd mirror configuration, it is actually documented, if not as visible as one might wish. Have a looke at https://github.com/spegel-org/spegel/tree/main/charts/spegel, specifically spegel.containerdMirrorAdd.

If you succeed with getting Bottlerocket nodes to work, we'd love an update to ./docs/COMPATIBILITY.md with instructions.

@empath-nirvana
Copy link

empath-nirvana commented May 3, 2024

Do you have some documentation on what the actual containerd configuration is supposed to be on the node? That might be useful for other OS's besides bottlerocket if you don't want spegel to configure it for you. I don't need bottlerocket-specific configuration, I can translate it to bottlerocket settings.

@bittrance
Copy link
Contributor

bittrance commented May 4, 2024

@empath-nirvana Spegel has no documentation about the exact configuration generated. This is because Spegel currently targets specific scenarios (see ./docs/COMPATIBILITY.md) and Spegel is supposed to "just work" in those scenarios. Currently, the mirror config injection works something like this:

The configuration is generated by the configuration subcommand.
The actual configuration generated is currently simple. It takes the list provided via spegel.registires and turns them into containerd hosts.yaml files. If you run Kind and Spegel locally (see PR #458), you can inspect the generated config:

$ docker exec kind-worker cat /etc/containerd/certs.d/ghcr.io/hosts.toml
server = 'https://ghcr.io'

[host]
[host.'http://172.19.0.4:30020']
capabilities = ['pull', 'resolve']

[host.'http://172.19.0.4:30021']
capabilities = ['pull', 'resolve']

@empath-nirvana
Copy link

Sorry to be a nuisance about this, but it's throwing an error message:

{"time":"2024-05-06T13:32:41.843433252Z","level":"ERROR","source":{"function":"main.main","file":"/build/main.go","line":88},"msg":"run exit with error","err":"Containerd registry config path needs to be set for mirror configuration to take effect"}

func verifyStatusResponse(resp *runtimeapi.StatusResponse, configPath string) error {
str, ok := resp.Info["config"]
if !ok {
return fmt.Errorf("could not get config data from info response")
}
cfg := &struct {
Registry struct {
ConfigPath string `json:"configPath"`
} `json:"registry"`
Containerd struct {
Runtimes struct {
DiscardUnpackedLayers bool `json:"discardUnpackedLayers"`
} `json:"runtimes"`
} `json:"containerd"`
}{}
err := json.Unmarshal([]byte(str), cfg)
if err != nil {
return err
}
if cfg.Containerd.Runtimes.DiscardUnpackedLayers {
return fmt.Errorf("Containerd discard unpacked layers cannot be enabled")
}
if cfg.Registry.ConfigPath == "" {
return fmt.Errorf("Containerd registry config path needs to be set for mirror configuration to take effect")
}
paths := filepath.SplitList(cfg.Registry.ConfigPath)
for _, path := range paths {
if path != configPath {
continue
}
return nil
}
return fmt.Errorf("Containerd registry config path is %s but needs to contain path %s for mirror configuration to take effect", cfg.Registry.ConfigPath, configPath)
}

Now, I haven't done any configuration on the bottlerocket node to add the mirror configuration, but it seems like that really shouldn't be throwing an error even if i haven't?

the config path is set in the args: - '--containerd-registry-config-path=/etc/containerd/certs.d'

I'm sort of digging through the code, but it looks like there's a lot more than just the initial setup that depends on that file existing, in which case it's not going to be a small amount of work to make bottlerocket support it...

@empath-nirvana
Copy link

empath-nirvana commented May 6, 2024

I commented out the verification lines that were throwing errors and rebuilt the container, and changed the helm chart so it binds to 127.0.0.1 instead of the node ip.. and set the bottle rocket config like this:

[[settings.container-registry.mirrors]]
"registry" = "gcr.io"
"endpoint" = ["127.0.0.1:30020", "127.0.0.1:30021"]

And i see messages like this in the logs:

"handling mirror request from external node"

So I assume that means it did work?

@bittrance
Copy link
Contributor

@empath-nirvana It seems you have it working then (though at this point it is hard to be completely sure since there currently is no way to see that a particular image arrived via spegel; see #24). Just so I understand fully, did you retain the check for "DiscardUnpackedLayers"?

@phillebaba I'm thinking the current Helm parameter spegel.containerdMirrorAdd should be repurposed (renamed?) so that it instructs the registry command to skip the config path check? The current implementation is somewhat odd in that it takes the config path parameter, but the only thing it is used for is matching against containerd's reported config dir. Were you planning to use it for something else?

If we go down this path, we are indirectly saying we should support containerds with the older config format. Is that problematic?

@phillebaba
Copy link
Member Author

No the check should be run no matter if the user wants Spegel to add the mirror configuration or not. Otherwise Spegel would start and people would think that things are working when they are not at all. We will never support the deprecated mirror configuration method. It has its limitations one of them being that the configuration is not hot reloaded. The second that it will be removed from Containerd.

It is important to understand the difference between the config path configuration and mirror configuration. The config path tells Containerd to look in a directory or directories for mirror configuration to load before pulling an image. This configuration is not hot reloaded and requires a restart of Containerd to take effect. The mirror configuration exists to tell Containerd about mirrors to try before attempting to pull the image from the original source. This configuration is hot reloaded as it is updated every time Containerd pulls an image.

The issue with Bottlerocket is that they have chosen not to update the Containerd configuration yet. The issue has been open for a while without much activity. While the work is not technically complicated to change, it may require some effort to figure out a migration path for existing users. I am not going to do this work as I am not a Bottlerocket user, so unless someone else will step up this issue will be blocked.

@empath-nirvana
Copy link

I'm not sure why hot reloading the containerd config is necessary if you have the correct configuration in the bottlerocket node when booting it?

@mballoni
Copy link

mballoni commented May 7, 2024

Wouldn't it be necessary every time a new node arrives or gets removed from the groups?

@phillebaba
Copy link
Member Author

In Bottlerockets case it would not be an issue as the configuration seems to be global and would be applied to all new nodes. Either way it is not something that I am going to support as the old configuration method is deprecated, and I do not have time debugging issues related to this. It would be a lot more valuable is Bottlerocket simply allowed for enabling the new configuration method instead. Or actually fixed it's mirror configuration to use the new method.

@empath-nirvana
Copy link

If you don't want to support it bottle rocket, that's fine. What I'm really asking for here is:

  1. Documentation for what the containerd configuration needs to be (whether it's hot reloaded or not)
  2. Documentation for how you can tell if it's actually working, which i believe there's already an open issue on.

Bottlerocket aside, there are going to be people who run on nodes where they won't have permissions to tweak containerd settings from a pod or don't want to, but may be able to update their node configuration in other ways.

@phillebaba
Copy link
Member Author

The required Containerd configuration is already documented in the Compatibility docs. All Containerd system configuration requires a restart for changes to apply.

https://github.com/spegel-org/spegel/blob/main/docs/COMPATIBILITY.md#compatibility

I have not come up with a good method of determining if Spegel is working or not. It is a best effort distributed cache, there are a lot of moving parts. A solution to run after install would be the easiest but would not verify that things are working later on. A continuously running check would be better but also be a lot more complicated. Maybe starting with a run once check would solve a lot of issues for initial debugging. Happy to take contributions if you have a simple and easy method to solve this.

Well if they are able to update their node configuration in other ways it is possible to disable Spegel from adding the mirror configuration. The only thing missing is to documented how the mirror configuration should look like, which would be an improvement to the docs. Spegel will however require that Containerd is properly configured, and restarted. How that is done is up to the user. Most cloud providers will add this configuration by default while others wont. If that configuration is not possible due to the provider, it will most likely be out of scope for Spegel to fix.

@wallentx
Copy link

@phillebaba Just so I understand this clearly- does the solution here https://github.com/spegel-org/spegel/blob/main/docs/COMPATIBILITY.md#eks not apply to bottlerocket nodes? After reading a few different issues on the bottlerocket-os project, I'm confused if the situation is:

  • Spegel doesn't function out-of-the-box on bottlerocket
    or
  • Spegel doesn't function at all on bottlerocket.

@phillebaba
Copy link
Member Author

Spegel does not work at all with Bottlerocket as far as I know, if the information in the issue bottlerocket-os/bottlerocket#1963 still holds true.

Currently Bottlerocket only uses the old mirror configuration for Containerd, which is not compatible with Spegel because it lacks certain features. Which is why I haven't spent time trying to figure out how to add compatibility. This old syntax will be removed in Containerd v2 so Bottlerocket should really update to the new syntax before it becomes a blocker to update Containerd version.

Sadly I am not a user of Bottlerocket so I don't really have the motivation to spend time to fix this. Technically it is not to difficult to fix the configuration, the challenge is documenting migration for existing users and verifying it wont break exiting use cases.

@empath-nirvana
Copy link

empath-nirvana commented Oct 10, 2024 via email

@jetersen
Copy link

@empath-nirvana can you share your bottle rocket user data addition? Perhaps with that in hand we can add a feature to toggle automatic containerd configuration.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

6 participants