KEP-4974: initial proposal for deprecating v1.Endpoints #4975
base: master
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: danwinship The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
4f695f2
to
85ba04c
Compare
Some feedback
I think this makes a lot of sense. We need to find a way to move the ecosystem to EndpointSlices such that Endpoints will not even be required ... that, I think, is a good litmus test.

One important thing is that we cannot regress on quality. We cannot just drop everything that uses Endpoints from e2e or we will lose a lot of coverage, so SIG Testing should carefully review the test plan. cc: @BenTheElder @pohly

I also think this is a new challenge in Kubernetes, as v1.Endpoints is a GA API, and some of the proposals, like removing it from conformance, are tricky; AFAIK we have never done that. I also think we have this target of 100% API endpoints coverage that will fail if we remove v1.Endpoints from conformance ... we need SIG Architecture input here @johnbelamaric @dims

Independently, this is definitely a topic we should work toward, though not one of the highest priority at this point IMHO.
The KEP does not propose to demote the Endpoints API from conformance; it only proposes to demote the Endpoints controller from conformance.
I think you're referring to tests like " |
I'm generally supportive of this direction, but it does raise the question of "what even is an API"? The API would technically still be there, but this change would effectively force any system that had been reading Endpoints managed by the Endpoints controller to transition to the EndpointSlice API which does feel pretty close to a deprecation even if the API does technically still exist.

As long as the EndpointSlice Mirroring controller is still running, people will still be able to create Endpoints and have them automatically translated to EndpointSlices, so this is primarily a question of systems that are reading from Endpoints. I do think that we need to have an answer for what we do if/when we migrate functionality from one v1 API to another.

In practice, a rather small portion of components are still relying on the Endpoints API, and an even smaller portion are relying on the Endpoints controller. Given that most of our existing e2e test coverage is focused on functionality that is now provided by the EndpointSlice controller, it's possible that a regression in the Endpoints controller could go unnoticed for a very long time. There's also the practical matter that running the Endpoints controller is likely wasting resources on >90% of existing Kubernetes clusters. More clearly communicating that running the Endpoints controller is optional could be a good step here.

To refer to the Kubernetes deprecation policy, making the Endpoints controller optional may fall under "Deprecating a feature or behavior", which has the following guidelines:
1 just requires lots of advance notice, and I think it's safe to say that the EndpointSlice controller is more widely tested and likely more reliable than the Endpoints controller at this point.
Yeah, I think we need to give a lot of heads up. Even if we're not using it in SIG-Network owned projects/tools anymore aside from these controllers, we don't know how users are still relying on the endpoints being populated.
But not actually documented as deprecated, right?
"Clusters with <= 1000 endpoints per service that don't use dual-stack / topology" is probably still a LOT of clusters that may be using it successfully with some other controller / tooling, even if we aren't with kube-proxy? I don't think gateway-api conformance tests are super relevant in this context?
Important distinction, though that still seems like a big expectation change. I don't think we have a good precedent for this yet. This seems relevant to @kubernetes/api-approvers in addition to SIG-Arch.
I had assumed this too, and that's true to the extent that lots of tests depend on the behavior of kube-proxy, and kube-proxy only looks at EndpointSlices. But there are still a surprising number of tests that look at Endpoints explicitly, and even some that look at Endpoints and not EndpointSlices (including, disturbingly, the dual-stack Service tests!) My plan here was to ensure that everything outside of
Right, I should have been clearer in the KEP but I was expecting this might take a while. (Antonio mentioned deprecating the in-tree cloud providers as a comparison point.) On that note, the KEP currently claims that demoting the Endpoints controller tests from conformance is part of "step 1", which I was thinking would sort of constitute the official announcement that we were doing this, but Antonio suggested that maybe it should come later in the process, so providers don't end up disabling the controller while there are still lots of third-party components that aren't ready for it.
Yeah... I had kinda been thinking it was, though it's obvious in hindsight that it's not (I look at those API docs all the time and should have noticed this).
Yes; I was thinking more in terms of "how much third-party software is still using Endpoints rather than EndpointSlices?". Even if most individual users of FooBarProxy™ have small single-stack clusters, the fact that some don't should mean that the authors would have been under pressure to port away from Endpoints for a while now.
I was mentioning it because it means we don't have to worry that there are Gateway implementations that are based on Endpoints rather than EndpointSlice.
Nothing in our stability guarantees stops us from, eg, making and promoting a ValidatingAdmissionPolicy that blocks Endpoints creation.
+1, I think it should look closer to cloud provider removal:
85ba04c
to
e731838
Compare
Overall LGTM idea
because a controller _isn't_ enabled seems slightly magical? But
perhaps there is precedent somewhere else?)

Another possibility would be to create an `endpoints-cleanup`
I prefer this, but it feels sort of permanent
- Update the conformance test suite to not require the Endpoints and
  EndpointSlice Mirroring controllers to be running, by rewriting some
  tests and demoting others from conformance.
Conformance and e2e should be behavior driven so this is ok if the behavior is covered ... IIRC most of the tests use the endpoints/endpoint slices objects to synchronize the test and avoid races and flakes because of that, so a conformance test that uses these helpers should be ok to use just EndpointSlices
- Explicitly document that disabling `endpoints-controller` and/or
  `endpointslice-mirroring-controller` via kube-controller-manager's
  `--controllers` flag is a supported and conforming configuration.
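For readers unfamiliar with the flag, the sketch below models how a `--controllers` value is interpreted: `*` enables the on-by-default controllers, `name` enables one controller, and `-name` disables it. This is a simplified illustration, not the kube-controller-manager source; `isControllerEnabled` and `report` are hypothetical helpers, and `endpointslice-controller` is assumed here as the name of a controller that stays enabled.

```go
package main

import (
	"fmt"
	"strings"
)

// isControllerEnabled models the selection rules: an explicit "name" or
// "-name" entry decides the outcome; otherwise "*" enables the controller
// only if it is on by default.
func isControllerEnabled(name string, onByDefault bool, controllers []string) bool {
	hasStar := false
	for _, c := range controllers {
		switch c {
		case name:
			return true
		case "-" + name:
			return false
		case "*":
			hasStar = true
		}
	}
	return hasStar && onByDefault
}

// report renders the enabled state of each named controller for a flag
// value, assuming all of them are on by default.
func report(flagValue string, names []string) string {
	controllers := strings.Split(flagValue, ",")
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%s=%v\n", name, isControllerEnabled(name, true, controllers))
	}
	return b.String()
}

func main() {
	fmt.Print(report(
		"*,-endpoints-controller,-endpointslice-mirroring-controller",
		[]string{
			"endpoints-controller",
			"endpointslice-mirroring-controller",
			"endpointslice-controller", // assumed name, for illustration
		},
	))
}
```

Under these rules, a value such as `*,-endpoints-controller,-endpointslice-mirroring-controller` leaves every other default controller running while turning off the two that the bullet above refers to.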
Moving towards being able to disable EndpointsController by default and clean up the endpoint-controller-maintained Endpoints objects seems plausible.
I don't think I agree with taking the EndpointSlice Mirroring controller out of conformance... user-created/maintained Endpoints objects have to keep getting turned into EndpointSlice objects automatically ~forever or we'll break v1 API behavior guarantees, right?
What guarantee did we make?
In my mind, Endpoints might go the way of ComponentStatus: still part of the OpenAPI, but it eventually stops working how you expect, per https://kubernetes.io/docs/reference/using-api/deprecation-policy/#deprecating-a-feature-or-behavior
Related to this, we should make the docs clearer about that guarantee. I'm not sure I can tell what we're breaking here.
> What guarantee did we make?
writing IPs to Endpoints API objects programs the network
> writing IPs to Endpoints API objects programs the network

I'm happy to accept that we've made this guarantee, but I don't think we have a document that shows how Endpoints is a commitment but ComponentStatus is / was not.
Something for another KEP perhaps?
yeah, it seems it is going to be complicated to justify taking the mirroring controller out of conformance
We do need to make sure everything in tree is using EndpointSlices and not Endpoints to be able to do that. I can't remember if things like aggregated service routing or kube-apiserver service proxying switched to use EndpointSlices or not.
aggregated service routing
kube-apiserver service proxying
uses Endpoints
@danwinship for reference, list of places that need migration to EndpointSlices
EDIT with the exact list
```go
// NewEndpointServiceResolver returns a ServiceResolver that chooses one of the
// service's endpoints.
func NewEndpointServiceResolver(services listersv1.ServiceLister, endpoints listersv1.EndpointsLister) ServiceResolver {
	return &aggregatorEndpointRouting{
		services:  services,
		endpoints: endpoints,
	}
}
```

```go
// ResolveEndpoint returns a URL to which one can send traffic for the specified service.
func ResolveEndpoint(services listersv1.ServiceLister, endpoints listersv1.EndpointsLister, namespace, id string, port int32) (*url.URL, error) {
```

```go
endpoints, err := c.endpointsLister.Endpoints(apiService.Spec.Service.Namespace).Get(apiService.Spec.Service.Name)
if apierrors.IsNotFound(err) {
	availableCondition.Status = apiregistrationv1.ConditionFalse
	availableCondition.Reason = "EndpointsNotFound"
	availableCondition.Message = fmt.Sprintf("cannot find endpoints for service/%s in %q", apiService.Spec.Service.Name, apiService.Spec.Service.Namespace)
	apiregistrationv1apihelper.SetAPIServiceCondition(apiService, availableCondition)
	_, err := c.updateAPIServiceStatus(originalAPIService, apiService)
	return err
```
EDITED
there are 3 places that resolve services using Endpoints: remote_available_controller, apiserver proxy and aggregator routing
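As a rough idea of what migrating one of these call sites could look like, the sketch below resolves a Service port to a URL from EndpointSlices instead of Endpoints. It is an assumption-laden illustration, not the in-tree code: `sliceEndpoint`, `slicePort`, `slice`, and `resolveFromSlices` are hypothetical, the types are simplified stand-ins for `discovery.k8s.io/v1` (where port fields are pointers), and it picks the first ready endpoint rather than reproducing the selection logic of the existing resolver.

```go
package main

import (
	"fmt"
	"net"
	"net/url"
	"strconv"
)

// Hypothetical, simplified stand-ins for the EndpointSlice API types.
type sliceEndpoint struct {
	Addresses []string
	Ready     *bool // nil is treated as ready
}

type slicePort struct {
	Name string
	Port int32
}

type slice struct {
	Endpoints []sliceEndpoint
	Ports     []slicePort
}

// resolveFromSlices returns an https URL for the first ready endpoint that
// exposes portName in any of the Service's slices.
func resolveFromSlices(slices []slice, portName string) (*url.URL, error) {
	for _, s := range slices {
		port, found := int32(0), false
		for _, p := range s.Ports {
			if p.Name == portName {
				port, found = p.Port, true
				break
			}
		}
		if !found {
			continue // this slice does not carry the requested port
		}
		for _, ep := range s.Endpoints {
			if (ep.Ready != nil && !*ep.Ready) || len(ep.Addresses) == 0 {
				continue
			}
			// JoinHostPort brackets IPv6 addresses, which matters for
			// dual-stack Services.
			host := net.JoinHostPort(ep.Addresses[0], strconv.Itoa(int(port)))
			return &url.URL{Scheme: "https", Host: host}, nil
		}
	}
	return nil, fmt.Errorf("no ready endpoints for port %q", portName)
}

func main() {
	notReady := false
	slices := []slice{{
		Endpoints: []sliceEndpoint{
			{Addresses: []string{"10.0.0.1"}, Ready: &notReady},
			{Addresses: []string{"fd00::2"}},
		},
		Ports: []slicePort{{Name: "https", Port: 443}},
	}}
	u, err := resolveFromSlices(slices, "https")
	fmt.Println(u, err)
}
```

The main structural difference from the Endpoints-based code is that a Service can have several slices, each with its own port list, so the lookup has to iterate over slices rather than over the subsets of a single object.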
$ grep -r ResolveEndpoint staging/
Note that `staging/src/k8s.io/kube-aggregator/pkg/apiserver/resolvers.go` is the only call to `proxy.ResolveEndpoint`; all of the other hits are unrelated methods.
edited my previous comment with more accurate information
``` | ||
<<[UNRESOLVED disable-by-default ]>>

- MAYBE eventually move `endpoints-controller` and
I agree with moving towards being able to turn off Endpoints controller by default. I don't see how we'd ever turn off EndpointsMirroring controller by default without it being a hard break of behavior tied to a v1 API.
Have we ever made a commitment such as: “any feature that is part of conformance will remain part of conformance for that major version”? (maybe we should; we can still deprecate Endpoints even if we do commit to that)
Did you mean "minor version" there? (eg, "33", not "1") Either way, the current full rules for demotion from conformance are https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/conformance-tests.md#demoting-conformance-tests.
I meant “major version”; some commitments, we won't break unless there is a Kubernetes 2.x (sounds fun, right?)
OK, so it sounds like we can deprecate Endpoints and we can remove the implementation where writing Endpoints has an effect. Have I got that right? I think everyone agrees we can deprecate Endpoints (did we already do that?) @danwinship maybe this makes sense to split into two KEPs:
See #4975 (comment) |
I see three bits:
1 is easily in bounds and is primarily a documentation change / warning. 2 could be in bounds, and I'm in favor of the direction, but needs to be done carefully. I have a hard time seeing us doing 3 while saying we're staying compatible with integrations writing to the v1 API surface.
OK, let's make that the split point; that can be its own KEP (I propose) and the discussion may help us document the tacit commitments we already made. The previous two points can come as part of the deprecation of Endpoints; not honoring desired state is what we split out. And of course we don't remove the Endpoints API whilst the version of K8s remains 1.x. How does that sound?
Note that while there has been a bunch of discussion of disabling the controllers by default (or even removing them), the KEP itself does not currently propose to do that, and explicitly suggests that that would probably require a second KEP.
e731838
to
59e390a
Compare
#4974
/sig network
/cc @aojea @robscott