
adding a hash function which doesn't rely on reflection for creating a… #23

Open · wants to merge 1 commit into section-v1 from feature/improved-hashing
Conversation

@mwickman

… hash of the nginx conf struct

I modeled this on the function in type_equals.go, which also walks the struct in order to determine equality.
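
A minimal sketch of that hand-walked hashing pattern, assuming a pared-down stand-in for the Location struct (the real types live in internal/ingress/types.go); writeString and writeInt32 here are illustrative helpers, not the PR's actual code:

package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"hash"
)

// Location is a simplified stand-in for ingress.Location.
type Location struct {
	Path   string
	Port   int32
	Denied *string
}

// writeString feeds a string into the running hash.
func writeString(s string, h hash.Hash) {
	h.Write([]byte(s))
}

// writeInt32 feeds an int32 into the running hash in a fixed byte order.
func writeInt32(i int32, h hash.Hash) {
	var buf [4]byte
	binary.BigEndian.PutUint32(buf[:], uint32(i))
	h.Write(buf[:])
}

// hashLocation visits each field explicitly, mirroring the way
// type_equals.go walks the struct to determine equality.
func hashLocation(l *Location, h hash.Hash) {
	writeString(l.Path, h)
	writeInt32(l.Port, h)
	if l.Denied != nil {
		writeString(*l.Denied, h)
	}
}

func main() {
	h := sha256.New()
	hashLocation(&Location{Path: "/", Port: 80}, h)
	fmt.Printf("%x\n", h.Sum(nil))
}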

@mwickman mwickman force-pushed the feature/improved-hashing branch from ae54b96 to 8876db8 Compare July 22, 2019 18:14
@mwickman (Author)

The file this is in was already created by us in a previous commit; it's not part of the upstream ingress-nginx codebase.

@jstangroome

@mwickman I recommend retaining the old reflection-based hashing code, either in its current location or moved to a *_test.go file. Regardless of location, implement a unit test that ensures the reflection-based hashing code and the non-reflection code produce the same output for a given input; this will assist in identifying fields missing from the comparison.
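
A hedged sketch of what such an equivalence test could look like; Config, reflectionHash, and manualHash are trivial stand-ins for illustration only, not the project's real types or functions:

package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"reflect"
	"testing"
)

// Config is a tiny stand-in for the nginx configuration struct.
type Config struct {
	Servers []string
	Path    string
}

// reflectionHash walks the exported fields via the reflect package,
// similar in spirit to the old implementation.
func reflectionHash(c Config) []byte {
	h := sha256.New()
	v := reflect.ValueOf(c)
	for i := 0; i < v.NumField(); i++ {
		fmt.Fprintf(h, "%v", v.Field(i).Interface())
	}
	return h.Sum(nil)
}

// manualHash writes the same representation field by field, by hand.
// A field forgotten here would make the test below fail.
func manualHash(c Config) []byte {
	h := sha256.New()
	fmt.Fprintf(h, "%v", c.Servers)
	fmt.Fprintf(h, "%v", c.Path)
	return h.Sum(nil)
}

func TestManualHashMatchesReflectionHash(t *testing.T) {
	cfg := Config{Servers: []string{"a.example.com", "b.example.com"}, Path: "/"}
	if !bytes.Equal(reflectionHash(cfg), manualHash(cfg)) {
		t.Error("manual hash diverges from reflection hash; a field may be missing")
	}
}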

Additionally, introduce benchmark tests demonstrating the justification for maintaining the hard-coded hashing function versus the reflection-based one.
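
And a matching benchmark sketch, appended to the same test file and reusing the Config, reflectionHash, and manualHash stand-ins from the test above; run with `go test -bench=. -benchmem`:

// benchConfig builds a config large enough for the difference to show.
func benchConfig() Config {
	servers := make([]string, 1000)
	for i := range servers {
		servers[i] = fmt.Sprintf("host-%d.example.com", i)
	}
	return Config{Servers: servers, Path: "/"}
}

func BenchmarkReflectionHash(b *testing.B) {
	cfg := benchConfig()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		reflectionHash(cfg)
	}
}

func BenchmarkManualHash(b *testing.B) {
	cfg := benchConfig()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		manualHash(cfg)
	}
}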

Finally, check the GitHub history for commit descriptions and PR discussions about why they chose reflection-based hashing, so we can re-open any old debates with new arguments.


Inline review comment on:

writeAuthTLSConfig(l.BasicDigestAuth, hash)
writeString(*l.Denied, hash)

This can panic with a nil pointer dereference if Denied (a pointer to string) is nil (as I saw in testing earlier). The fix I implemented is in our debug branch.
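
The actual fix isn't shown in the thread, but a minimal nil-safe variant might look like the following; writeStringPtr is a hypothetical name, and the sentinel byte keeps a nil pointer from hashing identically to an empty string:

package ingress

import "hash"

// writeStringPtr hashes a possibly-nil *string without dereferencing nil.
func writeStringPtr(s *string, h hash.Hash) {
	if s == nil {
		h.Write([]byte{0}) // sentinel: nil
		return
	}
	h.Write([]byte{1}) // sentinel: value present
	h.Write([]byte(*s))
}

The call site would then become writeStringPtr(l.Denied, hash) in place of writeString(*l.Denied, hash).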

@aledbf commented Jul 24, 2019

@mwickman are you interested in pushing this change in the upstream project?

@aledbf commented Jul 24, 2019

Additionally, introduce benchmark tests demonstrating the justification for maintaining the hard-coded hashing function versus the reflection-based one.

It would be great to see this benchmark :)

Finally, check the GitHub history for commit descriptions and PR discussions about why they chose reflection-based hashing, so we can re-open any old debates with new arguments.

No particular reason. It was the easiest solution, reusing the already available json tag in the structs.

@ivan-section-io

@aledbf It looks like the bulk of the problem is caused by the inclusion of the original k8s api ingress object on the location structs.
https://github.com/section-io/ingress-nginx/blob/1882955/internal/ingress/types.go#L224

In our use case we have many hundreds of hostnames on a k8s ingress definition, and the hash has been seen to take more than 15 minutes to complete (in production).
e.g. a k8s ingress object with 1000 hosts creates 1000 server entries with 4 locations each. Those 4000 locations all reference the original k8s ingress object with its 1000 hosts and cert references. The result was hundreds of millions of structs being hashed; the hash volume and time taken are O(n^2) in the number of hosts.

Setting that single field to be ignored reduces ~10+ minutes to ~10 seconds. e.g.

Ingress *Ingress `json:"ignore"`

The existing hardcoded hash func in this PR is ~1 second (although I personally suspect much of that benefit comes from skipping many members that the hash walker touches; e.g. it definitely doesn't touch the k8s ingress on location).

@aledbf commented Jul 25, 2019

Setting that single field to be ignored reduces ~10+ minutes to ~10 seconds.

Interesting. We have several users with more than 20000 ingresses and the issues they have are not related to the hash or reload times.

The existing hardcoded hash func in this PR is ~1 second

I am still interested in the upstream of this change :)

@ivan-section-io

@aledbf I think this might be because we have a smaller number of ingresses with a larger number of hosts per ingress.
The O(n^2) issue only kicks in when an ingress fans out into multiple server/locations (which then all reference the larger ingress).
e.g. 1000 k8s ingress objects with a single host each wouldn't experience this issue (as the k8s api ingress reference wouldn't be duplicated so many times).

I'm very interested in users with 20000 ingresses; we forked largely to avoid an nginx server block per hostname, because of the memory overhead incurred per nginx server block (we couldn't use an nginx aliases annotation since certs wouldn't load, as the cert CNs didn't match an explicit hostname).

Are these 20000 ingresses with 20000 hostnames/certs? Or 20000 locations on a limited set of hostnames? Or do they just throw memory at it?

(Actually, we've already realized that large numbers of hosts on a single ingress object is going to be a limitation, with etcd's 1MB object limits - which we'll hit at ~2000 hostnames on an ingress.)

@aledbf commented Jul 25, 2019

The O(n^2) issue only kicks in when an ingress fans out into multiple server/locations (which then all reference the larger ingress).

Thanks for this point.

e.g. 1000 k8s ingress objects with a single host each wouldn't experience this issue (as the k8s api ingress reference wouldn't be duplicated so many times).

I have a script that creates 5000 ingresses with different hostnames, one path, and TLS.
I could add another test like https://github.com/kubernetes/ingress-nginx/blob/master/test/e2e/leaks/lua_ssl.go to simulate the scenario you have.

(Actually, we've already realized that large numbers of hosts on a single ingress object is going to be a limitation, with etcd's 1MB object limits - which we'll hit at ~2000 hostnames on an ingress.)

Another problem with this approach is that you cannot use annotations (they will be applied to all the hosts/paths)

Please check

@aledbf commented Jul 25, 2019

Or do they just throw memory at it?

What's the memory utilization you see with your use case? (nginx and go)

@mwickman (Author)

@aledbf I'm definitely interested in merging this or some similar solution upstream. I'll need a bit of time, though, to prep a PR from the latest master branch; I likely won't reuse this commit.

@ElvinEfendi

I'm very interested in users with 20000 ingresses; we forked largely to avoid an nginx server block per hostname, because of the memory overhead incurred per nginx server block (we couldn't use an nginx aliases annotation since certs wouldn't load, as the cert CNs didn't match an explicit hostname).

@ivan-section-io is there an open issue describing, in a bit more detail, all your requirements that are lacking in upstream ingress-nginx? How are you overcoming this in your fork, bundling all hosts in a single server block? Does that mean you are OK with not being able to customize settings per hostname?

we couldn't use an nginx aliases annotation since certs wouldn't load, as the cert CNs didn't match an explicit hostname

Would it make sense to teach ingress-nginx to load certs based on aliases then?

I'm asking these questions to identify whether it'd make sense to support this case in upstream ingress-nginx. I've also been thinking of delegating route dispatching to Lua, which as I understand it would address your case. The idea would be to post server objects to Lua land as we do for certificates and backends. But this is a big change and would mean we couldn't use some stock Nginx directives.

@ivan-section-io

@ivan-section-io is there an open issue describing, in a bit more detail, all your requirements that are lacking in upstream ingress-nginx?

@ElvinEfendi No. We've largely just tweaked for our own needs, feeling they are outside the requirements of mainstream.

How are you overcoming this in your fork, bundling all hosts in a single server block? Does that mean you are OK with not being able to customize settings per hostname?

For each hostname "set", we use a single k8s ingress definition with an IngressRule for each hostname. The paths & backends for each IngressRule are all the same. We then use a custom template function to collapse servers by a common identifier (we hijacked server.alias since it now had no other use) and a custom template that calls it: {{ $servers := .Servers | reduceByAlias }}. This approach was selected as it has a lower chance of being impacted by ongoing internal changes (one new template function and one template call to it): https://github.com/section-io/ingress-nginx/blob/section-v1/internal/ingress/controller/template/reduce_by_alias.go#L27 It's ugly, but it works.
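
A simplified sketch of the idea; the Server type here is pared down to the relevant fields, and the real implementation is at the reduce_by_alias.go link above:

package template

// Server is a stand-in for the controller's server type; only the fields
// relevant to the collapsing step are shown.
type Server struct {
	Hostname string // rendered into the nginx server_name directive
	Alias    string // hijacked as the grouping key
}

// reduceByAlias collapses servers sharing an Alias into one entry, so the
// template emits a single nginx server block per group. nginx server_name
// accepts multiple space-separated names, so hostnames are concatenated.
func reduceByAlias(servers []*Server) []*Server {
	groups := map[string]*Server{}
	var out []*Server
	for _, s := range servers {
		if s.Alias == "" {
			out = append(out, s) // ungrouped servers pass through untouched
			continue
		}
		if first, ok := groups[s.Alias]; ok {
			first.Hostname += " " + s.Hostname // fold into the group's first server
			continue
		}
		groups[s.Alias] = s
		out = append(out, s)
	}
	return out
}

Registered in the template's FuncMap, a function like this is what makes {{ $servers := .Servers | reduceByAlias }} available to the template.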

Does that mean you are OK with not being able to customize settings per hostname?

Yes (well, per group of hostnames). In fact, I believe we would have just used the nginx.ingress.kubernetes.io/server-alias annotation if it was also used during certificate CN checks. (In hindsight, it may have been easier. We avoided it at the time, as we didn't want to maintain a fork around any of the more volatile code areas).

Would it make sense to teach ingress-nginx to load certs based on aliases then?

I certainly believe so. But a "full" implementation is non-trivial: nginx server_name also supports wildcards & regexes (http://nginx.org/en/docs/http/ngx_http_core_module.html#server_name). A grander discussion of what nginx.ingress.kubernetes.io/server-alias was intended to represent did not help us meet our immediate objectives at the time, so an upstreamable approach was not pursued.

I'm asking these questions to identify whether it'd make sense to support this case in upstream ingress-nginx.

I believe so, yes. To me, nginx.ingress.kubernetes.io/server-alias without certificate support makes little sense. But getting such support upstream is outside my immediate purview.

@aledbf commented Aug 14, 2019

In fact, I believe we would have just used the nginx.ingress.kubernetes.io/server-alias annotation if it was also used during certificate CN checks.

This is being done here https://github.com/kubernetes/ingress-nginx/blob/f4678764f5b661b04c6678be4c2f2a13caf8710e/internal/ingress/controller/nginx.go#L1005

@ivan-section-io

In fact, I believe we would have just used the nginx.ingress.kubernetes.io/server-alias annotation if it was also used during certificate CN checks.

This is being done here https://github.com/kubernetes/ingress-nginx/blob/f4678764f5b661b04c6678be4c2f2a13caf8710e/internal/ingress/controller/nginx.go#L1005

@aledbf I'm having trouble immediately seeing how that handles multiple hostnames? Where server.Alias = "a b c"

@aledbf commented Aug 14, 2019

I'm having trouble immediately seeing how that handles multiple hostnames?

The link is for the CN validation of the alias.

Where server.Alias = "a b c"

Adding multiple aliases should be easy and not a breaking change.
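
For example, one non-breaking way to allow this would be to split the annotation value on whitespace wherever the alias is consumed; a hedged sketch, where aliasesFrom is a hypothetical helper rather than anything in the codebase:

package main

import (
	"fmt"
	"strings"
)

// aliasesFrom splits a server-alias annotation value such as "a b c"
// into individual hostnames; a single alias yields a one-element slice,
// so existing configurations keep working unchanged.
func aliasesFrom(alias string) []string {
	return strings.Fields(alias)
}

func main() {
	for _, a := range aliasesFrom("a.example.com b.example.com c.example.com") {
		fmt.Println(a) // each alias would be validated against the cert CNs
	}
}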
