-
Notifications
You must be signed in to change notification settings - Fork 74
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migration Design: Rename Bundle to ClusterBundle #485
base: main
Are you sure you want to change the base?
Changes from all commits
be2f40d
702eb96
553f662
dd44ef1
c559344
19cebb2
78d85b1
6b04cd9
ce7c573
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,252 @@ | ||
# Design: Renaming Bundle to ClusterBundle | ||
|
||
- [Release Signoff Checklist](#release-signoff-checklist) | ||
- [Summary](#summary) | ||
- [Motivation](#motivation) | ||
- [Goals](#goals) | ||
- [Non-Goals](#non-goals) | ||
- [Proposal](#proposal) | ||
- [Risks and Mitigations](#risks-and-mitigations) | ||
- [Design Details](#design-details) | ||
- [Test Plan](#test-plan) | ||
- [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) | ||
- [Alternatives](#alternatives) | ||
- [Future Work](#future-work) | ||
|
||
## Release Signoff Checklist | ||
|
||
This checklist contains actions which must be completed before a PR implementing this design can be | ||
merged. | ||
|
||
- [ ] This design doc has been discussed and approved | ||
- [ ] Test plan has been agreed upon and the tests implemented | ||
- [ ] User-facing documentation has been PR-ed against the release branch in [cert-manager/website] | ||
|
||
TODO: add more items to checklist | ||
|
||
## Summary | ||
|
||
We want to rename `Bundle` to `ClusterBundle` for reasons described in "Motivation" below. | ||
|
||
Conversion webhooks in Kubernetes can only be used to convert between versions of an API, | ||
and not to convert between resources. This makes renaming harder to do, as it will represent | ||
deleting the old API and creating a new API. | ||
|
||
Even though trust-manager `Bundle` API is officially in an alpha state, we want to provide a smooth | ||
migration path for our users. For this reason we are proposing to do this change over multiple releases, | ||
requiring end-user action on the way, but without risking trust-manager target resources - which can provide | ||
critical features in a cluster. For additional details, please see "Proposal" below. | ||
|
||
## Motivation | ||
|
||
We have [an open issue](https://github.com/cert-manager/trust-manager/issues/63) suggesting renaming Bundle to ClusterBundle. | ||
Might seem like an obvious WontFix, considering all the assumed work (and pain) to do this, | ||
but it turns out that the current name somehow blocks introducing an equivalent namespace-scoped resource - | ||
for trust bundle management inside namespaces in multi-tenant clusters. | ||
|
||
In addition, `ClusterBundle` would be a better name looking at cert-manager having `Issuer` and `ClusterIssuer`. | ||
The name would also be better aligned with upstream [ClusterTrustBundle](https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/#cluster-trust-bundles). | ||
|
||
By pushing users to migrate smoothly to a new API, we can perform requested API changes at the same time. | ||
Consider it similar to a one-way conversion webhook. | ||
|
||
Since trust-manager is maturing and getting more attention, we suggest establishing its own API group: `trust-manager.io`. | ||
|
||
### Goals | ||
|
||
- `Bundle` resource is renamed to `ClusterBundle` | ||
- `ClusterBundle` is created in the new `trust-manager.io` API group | ||
- Provide a smooth migration for users of trust-manager | ||
- Use the opportunity window to improve the API | ||
- [Issue summarizing proposed API changes (#242)](https://github.com/cert-manager/trust-manager/issues/242) | ||
- [Draft PR for proposed API changes done when migrating (#486)](https://github.com/cert-manager/trust-manager/pull/486) | ||
|
||
|
||
### Non-Goals | ||
|
||
- Behavior changes in `Bundle`/`ClusterBundle` API while migrating (only simpler API improvements) | ||
- Introduce a new API version | ||
- Introduce a namespace-scoped bundle API (at least not now) | ||
|
||
## Proposal | ||
|
||
We propose to do this change over multiple releases: | ||
|
||
1. Version X: Only `Bundle` API is served and has the functional controller. (present state) | ||
2. Version X+1: We introduce `ClusterBundle` with API otherwise identical to `Bundle`: | ||
- `Bundle` is marked as deprecated. | ||
- `ClusterBundle` introduced with agreed API improvements. | ||
- Functional controller is migrated to `ClusterBundle` | ||
- Validating webhook for both `Bundle` and `ClusterBundle` with same validations | ||
- New technical controller added for `Bundle`, converting bundles to cluster bundles | ||
3. Version Y: Remove `Bundle` and only serve `ClusterBundle` (desired state) | ||
erikgb marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
Some of these actions/states require extra attention, which are further described below. | ||
|
||
### Functional Controller Migration | ||
|
||
This should be rather simple. Since the APIs are identical, we just have to perform a "smart" | ||
search and replace in the controller code. When the controller initially boots up on the X+1 version, | ||
there will be no `ClusterBundle` resources in the cluster, as these will be created/updated by the technical | ||
controller. | ||
|
||
The `ClusterBundle` reconciler should do the same as for `Bundle` in version X, but we should test/verify that | ||
the controller is able to change owner references on target resources from `Bundle` to `ClusterBundle` without | ||
facing issues. | ||
|
||
We must also ensure that the `ClusterBundle` reconciler isn't deleting target resources it thinks are "orphaned". | ||
We believe the controller will only delete target resources if/when the controller resource exists, and | ||
otherwise rely on the Kubernetes garbage collector to delete orphans. But this should be tested/verified. | ||
|
||
### Temporary Technical Controller | ||
|
||
When the functional controller is migrated to target `ClusterBundle`, we want to introduce a temporary technical | ||
controller targeting the now deprecated `Bundle` resource. | ||
|
||
This controller should just create or update the `ClusterBundle` with the same name as the reconciled `Bundle`. | ||
Since the APIs should be quite similar, this should be a simple controller. | ||
|
||
- `Bundle` `.spec` should be copied (and eventually slightly converted) into `ClusterBundle` `.spec` | ||
- `Bundle` `.status` should be updated from `ClusterBundle` `.status` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we should also add a clear message in the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. CRDs supports deprecating APIs/versions, and we can tailor the warning message. Isn't this enough? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Im pretty sure the CRD deprecation warning tells you every time you run kubectl against it, we can test this though to make sure thats true with all the commands (describe, get, apply). If that is the case then I think a CRD deprecation warning is enough |
||
|
||
To allow a user to migrate their resources from `Bundle` to `ClusterBundle`, the controller should not act | ||
on `Bundle` delete events - nor should it add an owner reference to `ClusterBundle` to avoid the Kubernetes garbage collector | ||
acting on deletes. | ||
|
||
For reconciling a `ClusterBundle` from `Bundle` we should investigate the possibility of "smart field manager name logic". | ||
It should be possible to read the field manager name used (by user) to create/update the `Bundle` resource. | ||
If we use the same field manager name in controller (per resource) to create/update `ClusterBundle`, we think it's less | ||
likely that users will have to force conflicts when migrating from `Bundle` to `ClusterBundle`. | ||
|
||
### Proposed API changes when migrating | ||
|
||
Improving the trust-manager API while migrating is not something we have to do, but | ||
appears like nice opportunity to fix some of the long-standing proposed API changes. | ||
|
||
There is an [open draft PR for proposed (and breaking) API changes](https://github.com/cert-manager/trust-manager/pull/486), | ||
but in order to make this a non-braking change for users, only the `ClusterBundle` API will include these changes. | ||
|
||
While this is subject to change, an example on how a user would migrate from `Bundle` to `ClusterBundle` without | ||
changing anything for target resources is shown below: | ||
|
||
```yaml | ||
--- | ||
apiVersion: trust.cert-manager.io/v1alpha1 | ||
kind: Bundle | ||
metadata: | ||
name: my-bundle | ||
spec: | ||
sources: | ||
- configMap: | ||
name: my-ca | ||
key: ca.crt | ||
target: | ||
configMap: | ||
key: ca.crt | ||
additionalFormats: | ||
jks: | ||
key: bundle.jks | ||
pkcs12: | ||
key: bundle.p12 | ||
``` | ||
|
||
Which would be migrated to: | ||
|
||
```yaml | ||
--- | ||
apiVersion: trust-manager.io/v1alpha1 | ||
kind: ClusterBundle | ||
metadata: | ||
name: my-bundle | ||
spec: | ||
sources: | ||
- configMap: | ||
name: my-ca | ||
key: ca.crt | ||
target: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I've often thought that instead of having This makes supporting future sources much easier, say for example if we wanted to add a plugin system like cert-manager has. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Interesting! Could you draw an example? 😊 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 🤦 I was thinking of sources not targets, for example: apiVersion: trust-manager.io/v1alpha1
kind: ClusterBundle
metadata:
name: my-bundle
spec:
sources:
# Using apiVersion/kind means we can utilise this same array for
# any future "extensions"/"plugins".
#
# Loading from other sources could be a valuable feature and this API change
# would enable that option in the future.
- apiVersion: v1
kind: ConfigMap
name: my-ca
key: ca.crt
... In my head it would mean that AWS for example could release an operator that exposes the That kind of support would be added in the longer term, not this piece of work, but the API change could enable that future. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Good input, thanks! I suggest moving the detailed API changes discussion to #242. I have also opened #486, which will be modified to change the initially unserved For this design, I suggest we agree that we should consider changing the API. And not exactly what we want to change. Are you ok with this approach? |
||
configMap: | ||
- key: ca.crt | ||
- key: bundle.jks | ||
format: JKS | ||
- key: bundle.p12 | ||
format: PKCS12 | ||
namespaceSelector: {} # Selecting all namespaces here, but now required in API | ||
``` | ||
|
||
### Risks and Mitigations | ||
|
||
#### Target configmaps/secrets are accidentally deleted | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are two conflicting goals here:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. An alternative approach here would be to add owner references and document how to remove those owner references when you want to start using the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If "owner references" are too complex, we can also implement our own garbage collector (eg. using labels). There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do you consider this as "conflicting goals"? I did not mention "do not want to break the behavior when deleting the Bundle resource" as a goal specifically, did I? 🤔 Considering the alpha state of Bundle API, I don't see a big harm in potentially orphaning some resources that can easily be deleted by the user as a workaround. The alternative could be a lot more destructive. As a user, I don't want my |
||
|
||
Since using owner references in this project, we need to be extra careful when performing changes like this. | ||
|
||
We mitigate this risk by **NOT** cascading deletes of `Bundle` to `ClusterBundle`. | ||
This is done by **NOT** adding owner references to `ClusterBundle`, and instead use the resource `name` | ||
to map between `ClusterBundle` and `Bundle` resources. | ||
|
||
Owner references on target resources are changed by the reconciler from `Bundle` to `ClusterBundle`, | ||
which means targets could only be deleted if the `ClusterBundle` is deleted or `namespaceSelector` | ||
is changed. This is the expected behavior. | ||
|
||
#### Target configmaps/secrets appearing as orphaned | ||
|
||
This could happen if a users deletes a `Bundle` resource expecting the target configmaps/secrets | ||
to vanish. Unfortunately this won't happen by the defensive approach taken to not cascade `Bundle` | ||
delete to `ClusterBundle`. | ||
|
||
In order to achieve the wanted behavior, a user would have to delete the corresponding `ClusterBundle`, | ||
which appears like a simple workaround that we should document. | ||
|
||
## Design Details | ||
|
||
### Test Plan | ||
|
||
TODO | ||
|
||
### Upgrade / Downgrade Strategy | ||
|
||
Upgrading should represent less of challenge if we ensure enough release between version X+1 and Y. | ||
We should document that upgrading directly from version X (or earlier) to Y is explicitly **NOT** supported. | ||
|
||
Downgrading to version X (or earlier) from a version where the migration has started could be problematic. | ||
We can probably mitigate this risk by keeping owner references (non-controller, as Kubernetes doesn't allow | ||
multiple controller references) from `Bundle` to target resources, in addition to the controller references from | ||
`ClusterBundle`, during the migration period. | ||
|
||
## Alternatives | ||
erikgb marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
||
### Introduce `ClusterBundle` as a new and Independent API | ||
|
||
Since a `Bundle` and a `ClusterBundle` could then address the same target resources (by name), we would have to | ||
take this into account in controllers. In addition, we would have a **lot** more code to maintain and test. | ||
This could be partly mitigated by creating reusable abstractions that work for both resources. | ||
But if we want to allow `ClusterBundle` to provide new features, this would become complex. | ||
|
||
The migration path from `Bundle` to `ClusterBundle` would also be unclear and complex. | ||
Use case: As a user, I would like to use `ClusterBundle` instead of `Bundle` **without interrupting the target resources**. | ||
|
||
``` | ||
Bundle -> ClusterBundle -> target resources | ||
``` | ||
|
||
seems a lot simpler to manage than | ||
|
||
``` | ||
Bundle -> target resources AND | ||
ClusterBundle -> target resources | ||
``` | ||
|
||
### Just rename resource between releases | ||
|
||
Since the `Bundle` API version is `v1alpha1`, we could justify just doing the simplest thing and rename. | ||
This approach could cause potentially catastrophic failures in user clusters when the `Bundle` CRD is deleted | ||
since all target configmaps/secrets are owned by bundle and would be deleted by the Kubernetes garbage collector. | ||
|
||
### Doing Nothing | ||
|
||
See "Motivation" above. | ||
|
||
## Future Work | ||
|
||
- Introduce a **namespace-scoped** `Bundle` resource. | ||
- Integration with upstream [ClusterTrustBundle API](https://kubernetes.io/docs/reference/access-authn-authz/certificate-signing-requests/#cluster-trust-bundles). | ||
- New and improved version of the `ClusterBundle` API |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to trust-manager.io from cert-manager.io?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, from
trust.cert-manager.io
totrust-manager.io
- as a grown-up. 😉