-
Notifications
You must be signed in to change notification settings - Fork 150
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Annotate Service with LB ID and use for lookup #245
Conversation
This change decouples load-balancer names from their identity, thereby permitting renames that do not impact associations to Services any longer. Instead of the load-balancer name, we use the immutable load-balancer UUID. As a beneficial side-effect, we can eliminate the local load-balancer cache since the DigitalOcean API allows for direct lookups by UUID in O(1) (vs. O(n) that we previously experienced when iterating through all load-balancers existing in a given account). Load-balancer IDs are added to their corresponding Service objects as annotations. This happens both on load-balancer creates as well as most retrieval operations (e.g., EnsureLoadBalancer) in order to annotate pre-existing Services as well. When no load-balancer ID can be found, we fall back to doing lookups by name. Additional changes this PR makes: - Add new getter method retrieveAndAnnotateLoadBalancer to retrieve a load-balancer by ID or name (based on the availability of the LB ID annotation), and patch the associated Service object with the found ID. - Use new getter method in {Get,Ensure,Update}LoadBalancer and EnsureLoadBalancerDeleted. - Exclude load-balancer activity status check from GetLoadBalancer as this is not the purpose of the method. (It is a pure getter without any logic.) - Save on API calls in EnsureLoadBalancer by leveraging the LoadBalancer value returned by the new getter method. - Improve clarity/readability of EnsureLoadBalancer by introducing a switch case statement. - Remove the load-balancer cache and reintroduce logic around name-based lookups that existed previously. - Remove the extra Kubernetes client maintained on the ResourcesController in favor of the client already managed by the resources struct. - Update the tagging service to use LB IDs for lookups. - Annotate errors. Additional test-related changes: - Extend and refactor tests. - Use bare godo.Client for tests instead of godo.NewClient. The former comes with no initialization whatsoever, thus preventing us from running against the real DigitalOcean API during tests accidentally. Additionally, we also fail hard if a used API is not stubbed out.
Another note: while the cloud provider interface mandates that Service objects should not be modified, this actually does not hold in practice anymore: it used to be true when the cloud provider framework still used UPDATE requests to add the external load-balancer IP address (ingress status) to the Service object without handling resource conflicts. These days, it employs PATCH semantics as well. My tests have shown no problems whatsoever. kubernetes/cloud-provider#27 is touching on the matter as well. (See also my comment there and the following discussion.) |
@@ -32,6 +32,10 @@ import ( | |||
) | |||
|
|||
const ( | |||
// annoDOLoadBalancerID is the annotation specifying the load-balancer ID | |||
// used to enable fast retrievals of load-balancers from the API by UUID. | |||
annoDOLoadBalancerID = "service.k8s.digitalocean.com/load-balancer-id" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use doks.digitalocean.com/load-balancer-id
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced by kubernetes.digitalocean.com/load-balancer-id
as discussed via DM to make this less specific about DOKS. (DIY clusters will use the annotation as well.)
var lb *godo.LoadBalancer | ||
lb, err = l.retrieveAndAnnotateLoadBalancer(ctx, service) | ||
switch { | ||
case err == nil: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use switch err
here.
docs/getting-started.md
Outdated
|
||
`digitalocean-cloud-controller-manager` attaches the UUID of load-balancers to the corresponding Service objects (given they are of type `LoadBalancer`) using the `service.k8s.digitalocean.com/load-balancer-id` annotation. This serves two purposes: | ||
|
||
1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can remove “ without impacting the correct working of digitalocean-cloud-controller-manager
” because ‘support’ means the same thing. Perhaps just ‘To support renaming Load Balancers.’
@@ -75,6 +75,17 @@ The primary purpose of the variable is to allow DigitalOcean customers to easily | |||
|
|||
When a cluster is created in a non-default VPC for the region, the environment variable `DO_CLUSTER_VPC_ID` must be specified or Load Balancer creation for services will fail. | |||
|
|||
### Load-balancer ID annotations | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We write it as Load Balancers in all our marketing materials, so that’s how we should write it here too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pretty much all references in CCM use the load-balancer style right now. If we want to adopt the marketing speech, let's do this in a separate PR.
docs/getting-started.md
Outdated
@@ -75,6 +75,17 @@ The primary purpose of the variable is to allow DigitalOcean customers to easily | |||
|
|||
When a cluster is created in a non-default VPC for the region, the environment variable `DO_CLUSTER_VPC_ID` must be specified or Load Balancer creation for services will fail. | |||
|
|||
### Load-balancer ID annotations | |||
|
|||
`digitalocean-cloud-controller-manager` attaches the UUID of load-balancers to the corresponding Service objects (given they are of type `LoadBalancer`) using the `service.k8s.digitalocean.com/load-balancer-id` annotation. This serves two purposes: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider adding a version here so people don’t rename on older versions of CCM.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need to address this one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
docs/getting-started.md
Outdated
1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`. | ||
2. To efficiently look up load-balancer resources in the DigitalOcean API. | ||
|
||
The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace this paragraph with
digitalocean-cloud-controller-manager
annotates new and existing Services. A Load Balancer that is renamed before the annotation is added gets lost, and a new one will be created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/added gets lost/added will be lost/
docs/getting-started.md
Outdated
|
||
The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling. | ||
|
||
It is also possible to start owning an unmanaged load-balancer by adding the annotation manually prior to setting the Service type to `LoadBalancer`. Note however every load-balancer should be owned by exactly one `digitalocean-cloud-controller-manager` instance only to prevent concurrent and presumably conflicting modifications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can import an existing Load Balancer by creating the
Service
annotated with the ID of the Load Balancer. Don’t do this while it’s being managed by another cluster, or thedigitalocean-cloud-controller-managers
will make conflicting modifications to the Load Balancer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went with your suggestion with two exceptions:
- I think "owning" is clearer than "importing"
- You can also own/import a load-balancer after the Service has been created as long as the type isn't set to
LoadBalancer
yet. I think the new wording makes this easy enough to understand now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can just call it ‘managing’ instead of owning, because that’s what you call it in the next sentence too.
You can also own/import a load-balancer after the Service has been created as long as the type isn't set to LoadBalancer yet
I think the way you’re wording this is confusing, because it makes it sounds like you need to:
- Create a Service
- Set the annotation
- Change the type to LoadBalancer
While this is possible, it’s a very roundabout way of doing it. You can just create a service of type LoadBalancer with the annotation in one go. Perhaps you could say ‘You [...] manage an existing load-balancer by creating a LoadBalancer Service annotated with the UUID of the load-balancer.’
Also or otherwise
is a tautology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adopted your suggestion and rephrased the second sentence to be less tautological.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just dropped "otherwise" in the last sentence.
// lbByName gets a DigitalOcean Load Balancer by name. The returned error will | ||
// be lbNotFound if the load balancer does not exist. | ||
func (l *loadBalancers) lbByName(ctx context.Context, name string) (*godo.LoadBalancer, error) { | ||
lbs, _, err := l.client.LoadBalancers.List(ctx, &godo.ListOptions{}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this needs to paginate, like allLoadBalancerList
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
Lucky for us, we already had a allLoadBalancerList
function, so I just used that.
I was a little concerned that this would cause problems with
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this looks really good. Nice work. Will 👍 once @bouk 's concerns are addressed.
id = loadBalancerIDsByName[name] | ||
} | ||
|
||
if id != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could either do an else
here or just put a continue
at the end of the if id == "" {}
block.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We cannot assume that we have an ID for sure at this point: in the case where the user renamed the LB, loadBalancerIDsByName[name]
wouldn't return anything.
I added a comment to make that clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Yeah, now I see this. Sorry for the unnecessary comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not at all -- it pointed out that the reasoning behind this wasn't clear.
docs/getting-started.md
Outdated
1. To support load-balancer renames without impacting the correct working of `digitalocean-cloud-controller-manager`. | ||
2. To efficiently look up load-balancer resources in the DigitalOcean API. | ||
|
||
The annotation is added for both newly created and existing load-balancers. However, existing load-balancers lacking the annotation yet can only be associated correctly if they were not renamed before. Otherwise, the renamed load-balancer will be considered missing and a new one will be created instead, leaving the old one dangling. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/added gets lost/added will be lost/
@@ -267,6 +267,7 @@ k8s.io/apiextensions-apiserver/pkg/features | |||
# k8s.io/apimachinery v0.0.0 => k8s.io/apimachinery v0.0.0-20190612205821-1799e75a0719 | |||
k8s.io/apimachinery/pkg/labels | |||
k8s.io/apimachinery/pkg/types | |||
k8s.io/apimachinery/pkg/util/strategicpatch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're not actually adding/removing anything from the vendor here. Why did this file change? Do we want to include these changes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it changed because we started to use the referenced package as a direct dependency now, whereas it was consumed transitively only before.
@bouk @jcodybaker addressed all comments. PTAL again. |
We overwrite lb in the return call to Update() which means we cannot reference it in case of an error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment about the existing load balancer docs, but then LGTM
id = loadBalancerIDsByName[name] | ||
} | ||
|
||
if id != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. Yeah, now I see this. Sorry for the unnecessary comment?
@@ -8,6 +8,7 @@ | |||
|
|||
### Changed | |||
|
|||
* Annotate Service objects by load-balancer UUIDs to enable free LB renames and improve the DO API consumption performance (@timoreimann) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noticed this got merged into the v0.1.16
section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch -- submitted #249 to fix.
This change decouples load-balancer names from their identity, thereby permitting renames that do not impact associations to Services any longer.
Instead of the load-balancer name, we use the immutable load-balancer UUID. As a beneficial side-effect, we can eliminate the local load-balancer cache since the DigitalOcean API allows for direct lookups by UUID in O(1) (vs. O(n) that we previously experienced when iterating through all load-balancers existing in a given account).
Load-balancer IDs are added to their corresponding Service objects as annotations. This happens both on load-balancer creates as well as most retrieval operations (e.g.,
EnsureLoadBalancer
) in order to annotate pre-existing Services as well. When no load-balancer ID can be found, we fall back to doing lookups by name.Apologies for the somewhat bloated PR, which is mostly due to the test changes required by both leveraging the LB UUID as well as removing the LB cache. Unfortunately, these two should really go together. #243 would be helpful to have, but I did not want to add yet another refactoring in the same PR.
Additional changes this PR makes:
retrieveAndAnnotateLoadBalancer
to retrieve a load-balancer by ID or name (based on the availability of the LB ID annotation), and patch the associated Service object with the found ID.{Get,Ensure,Update}LoadBalancer
andEnsureLoadBalancerDeleted
.GetLoadBalancer
as this is not the purpose of the method. (It is a pure getter without any logic.)EnsureLoadBalancer
by leveraging the LoadBalancer value returned by the new getter method.EnsureLoadBalancer
by introducing a switch case statement.ResourcesController
in favor of the client already managed by theresources
struct.Additional test-related changes:
godo.Client
for tests instead ofgodo.NewClient
. The former comes with no initialization whatsoever, thus preventing us from running against the real DigitalOcean API during tests accidentally. Additionally, we also fail hard if a used API is not stubbed out.Fixes #224