Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update PublicSuffixMatcher to use "formal algorithm" #566

Closed

Conversation

massdosage
Copy link
Contributor

See https://issues.apache.org/jira/browse/HTTPCLIENT-2336 for context.

After porting over the tests for the formal algorithm from https://github.com/publicsuffix/list/blob/master/tests/test_psl.txt we then proceeded to make a number of changes to get them all to pass. Since this a non-backwards compatible (but hopefully more correct) change in behaviour we have introduced the concept of a "legacy mode" which could be enabled to ease a transition between behaviours but we have favoured the new behaviour and adjusted the existing tests accordingly.

There are obviously different ways we could approach this so would appreciate input and suggestions.

@@ -111,7 +111,7 @@ void testParseLocal() {

cookie.setDomain(".blah");
cookie.setAttribute(Cookie.DOMAIN_ATTR, ".blah");
Assertions.assertTrue(filter.match(cookie, new CookieOrigin("somehost.blah", 80, "/stuff", false)));
Assertions.assertFalse(filter.match(cookie, new CookieOrigin("somehost.blah", 80, "/stuff", false)));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One shouldn't be allowed to set a cookie for the "blah" TLD as this is a potential security risk so we inverted the assertion as we believe the new behaviour is more secure.

@@ -258,11 +259,11 @@ void testIdentityMatching() {
Assertions.assertTrue(DefaultHostnameVerifier.matchIdentity("a.b.xxx.uk", "a.b.xxx.uk", publicSuffixMatcher));
Assertions.assertTrue(DefaultHostnameVerifier.matchIdentityStrict("a.b.xxx.uk", "a.b.xxx.uk", publicSuffixMatcher));

Assertions.assertTrue(DefaultHostnameVerifier.matchIdentity("a.b.xxx.uk", "*.b.xxx.uk", publicSuffixMatcher));
Assertions.assertTrue(DefaultHostnameVerifier.matchIdentityStrict("a.b.xxx.uk", "*.b.xxx.uk", publicSuffixMatcher));
Assertions.assertFalse(DefaultHostnameVerifier.matchIdentity("a.b.xxx.uk", "*.b.xxx.uk", publicSuffixMatcher));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a wildcard certificate for a whole public suffic ("b.xx.uk" in this case) feels like it shouldn't be allowed but we don't know enough about the rules of SSL certs to know for sure. For now we have updated the tests to pass according to the PSL algorithm but would value further input here.

Copy link
Member

@garydgregory garydgregory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if the legacy mode should be an enum instead of a boolean. This would avoid futher API changes in case there are more changes needed. Let's see what @ok2c thinks.

@ok2c
Copy link
Member

ok2c commented Aug 15, 2024

I'm wondering if the legacy mode should be an enum instead of a boolean.

@massdosage Let's simplify things. I do not think we need to preserve the old legacy behavior, especially it is wrong. Please drop the parameter and the related conditional code.

@massdosage
Copy link
Contributor Author

@garydgregory @ok2c thanks for the prompt initial review. I've removed the legacyMode parameter as requested.

@ok2c
Copy link
Member

ok2c commented Aug 16, 2024

@massdosage I am fine with your changes to the PublicSuffixMatcher class. However I do think we must preserve the original behavior of #matchIdentity method of DefaultHostnameVerifier. At the very least all tests involving amazonaws.com need to retain their original behavior as they are based on the real-life usage scenarios reported by HttpClient users.

@ok2c
Copy link
Member

ok2c commented Aug 17, 2024

@massdosage These are the additional changes that I have made on top of your pull request: ok2c/httpcomponents-client@77aefb6. Please take a look. If they look reasonable to you, I will merge your pull request and apply my changes on top of it.

@massdosage
Copy link
Contributor Author

massdosage commented Aug 19, 2024

@ok2c thanks, I took a look, we haven't used the DefaultHostnameVerifier component so I'm not entirely sure of its intended uses, especially with respect to wildcard certs which appears to be the issue here. Is the intention to implement https://datatracker.ietf.org/doc/html/rfc9525#security-wildcards and https://en.wikipedia.org/wiki/Public_key_certificate#Wildcard_certificate? If so I think the examples at https://en.wikipedia.org/wiki/Public_key_certificate#Further_examples could be good ones to aim for in the unit tests. The current examples with "ec2.compute-1.amazonaws.com" are a bit confusing as I'm pretty sure AWS would never issue a wildcard cert of that form.

@ok2c
Copy link
Member

ok2c commented Aug 19, 2024

@massdosage RFC 9525 is a new document which HttpClient does not conform to yet. What we need to ensure that HttpClient conforms to RFC2818

The current examples with "ec2.compute-1.amazonaws.com" are a bit confusing as I'm pretty sure AWS would never issue a wildcard cert of that form

But it looks like they did. Please see https://issues.apache.org/jira/browse/HTTPCLIENT-2247

@massdosage
Copy link
Contributor Author

OK, for RFC2818 it looks like one just needs to implement the wildcard at the "level" in the domain name it's specified but not any "lower". One could do that with just string matching and not even involve the PublicSuffixList. Either way, you know more about the code here and why it is the way it currently is and what future plans are so I'm happy to go with this being merged and your changes being applied afterwards if that works for you. I'm also happy to help out here and in future if there is anything related I can do.

@ok2c
Copy link
Member

ok2c commented Aug 19, 2024

@massdosage Committed as 715ec11

@ok2c ok2c closed this Aug 19, 2024
@massdosage
Copy link
Contributor Author

Great, thank you!

We'll keep an eye out for the 5.4-beta2 and 5.4 releases so we can then switch over to this and retire our internal forked version.

@garydgregory
Copy link
Member

Hello @massdosage

This PR apparently is reference on the ML at https://lists.apache.org/thread/mwhzx1n9v3317fj6x050nkozvdkszvo9 as the breaking change for local builds on my PC.

With the HEAD of git master, running:

mvn clean generate-resource 
mvn

I get:

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hc.client5.http.psl.TestPublicSuffixMatcher
[ERROR] Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.944 s <<< FAILURE! - in org.apache.hc.client5.http.psl.TestPublicSuffixMatcher
[ERROR] org.apache.hc.client5.http.psl.TestPublicSuffixMatcher.testGetDomainRootPublicSuffixList  Time elapsed: 0.019 s  <<< FAILURE!
org.opentest4j.AssertionFailedError: expected: <null> but was: <uk.com>
        at org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at org.junit.jupiter.api.AssertEquals.failNotEqual(AssertEquals.java:197)
        at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:182)
        at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:177)
        at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:1145)
        at org.apache.hc.client5.http.psl.TestPublicSuffixMatcher.checkPublicSuffix(TestPublicSuffixMatcher.java:171)
        at org.apache.hc.client5.http.psl.TestPublicSuffixMatcher.testGetDomainRootPublicSuffixList(TestPublicSuffixMatcher.java:208)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at java.util.ArrayList.forEach(ArrayList.java:1259)
        at java.util.ArrayList.forEach(ArrayList.java:1259)

Running on:

Apache Maven 3.9.8 (36645f6c9b5079805ea5009217e36f2cffd34256)
Maven home: C:\java\apache-maven-3.9.8
Java version: 17.0.12, vendor: Eclipse Adoptium, runtime: C:\Program Files\Eclipse Adoptium\jdk-17.0.12.7-hotspot
Default locale: en_US, platform encoding: Cp1252
OS name: "windows 10", version: "10.0", arch: "amd64", family: "windows"

Please advise.

Gary

@massdosage
Copy link
Contributor Author

Let's discuss on the dev mailing list to keep things in one place if that's OK?

@garydgregory
Copy link
Member

Let's discuss on the dev mailing list to keep things in one place if that's OK?

👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants