massdosage commented on code in PR #629:
URL: 
https://github.com/apache/httpcomponents-client/pull/629#discussion_r2021437718


##########
httpclient5/src/test/java/org/apache/hc/client5/http/psl/TestPublicSuffixMatcher.java:
##########
@@ -284,14 +284,14 @@ void testGetDomainRootPublicSuffixList() {
         checkPublicSuffix("shishi.中国", "shishi.中国");
         checkPublicSuffix("中国", null);
         // Same as above, but punycoded.
-        checkPublicSuffix("xn--85x722f.com.cn", "xn--85x722f.com.cn");
-        checkPublicSuffix("xn--85x722f.xn--55qx5d.cn", "xn--85x722f.xn--55qx5d.cn");
-        checkPublicSuffix("www.xn--85x722f.xn--55qx5d.cn", "xn--85x722f.xn--55qx5d.cn");
-        checkPublicSuffix("shishi.xn--55qx5d.cn", "shishi.xn--55qx5d.cn");
+        checkPublicSuffix("xn--85x722f.Com.Cn", "食狮.com.cn");
+        checkPublicSuffix("xn--85x722f.xn--55qx5d.CN", "食狮.公司.cn");
+        checkPublicSuffix("www.xn--85x722f.xn--55qx5d.cn", "食狮.公司.cn");
+        checkPublicSuffix("shishi.xn--55qx5d.cn", "shishi.公司.cn");
         checkPublicSuffix("xn--55qx5d.cn", null);
-        checkPublicSuffix("xn--85x722f.xn--fiqs8s", "xn--85x722f.xn--fiqs8s");
-        checkPublicSuffix("www.xn--85x722f.xn--fiqs8s", "xn--85x722f.xn--fiqs8s");
-        checkPublicSuffix("shishi.xn--fiqs8s", "shishi.xn--fiqs8s");
+        checkPublicSuffix("xn--85x722f.xn--fiqs8s", "食狮.中国");
+        checkPublicSuffix("www.xn--85x722f.xn--fiqs8s", "食狮.中国");
+        checkPublicSuffix("shishi.xn--fiqs8s", "shishi.中国");

Review Comment:
   I don't see anywhere in the standard that says that if one passes Punycode 
in, one should expect to get Unicode out. I think the line you quote above 
comes from the 
"[Entry Specification](https://github.com/publicsuffix/list/wiki/Format#entry-specification)" 
section, which defines the layout of the PSL _file_, not how a lookup behaves. 
The algorithm is defined later on that page under 
https://github.com/publicsuffix/list/wiki/Format#algorithm but it doesn't 
specifically call out Punycode.
   
   My understanding is that the set of unit tests they provide exists exactly 
to avoid inconsistency: if one's implementation behaves the same as theirs and 
produces the same results as the unit tests, then it's correct. We can see they 
purposefully added this behaviour a long time ago via this commit: 
https://github.com/publicsuffix/list/commit/ddc97474bc8d0de6b70de6ac37125a371e6df439#diff-7ff3771a2abbfd9f8dfc636e6fd2ba9ebb72f59f791ed6df380066c66a9f4179R28
 There is a comment there that says "The EffectiveTLDService always gives back 
punycoded labels.", which is the behaviour we see in the unit tests. 
   
   I agree that it's not all very clear, but I take the set of tests they 
provide, which they run against incoming contributions, to be what they 
consider "correct", and anything claiming to implement the standard should 
behave the same way for the same input.
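   
   To make "Punycode in, Punycode out" concrete, here is a minimal sketch of 
how a result could be returned in the same form as the input. It is not the 
code in this PR: `PunycodeFormSketch`, `hasPunycodeLabel` and `inInputForm` are 
hypothetical names, and only `java.net.IDN` from the JDK is assumed. The 
Unicode value is derived with `IDN.toUnicode` so the example stays ASCII-only.

```java
import java.net.IDN;
import java.util.Locale;

// Illustrative sketch only; class and method names are hypothetical.
final class PunycodeFormSketch {

    private PunycodeFormSketch() {
    }

    // True if any label carries the ACE prefix "xn--" (case-insensitive),
    // i.e. the caller supplied the name in Punycode form.
    public static boolean hasPunycodeLabel(final String domain) {
        for (final String label : domain.toLowerCase(Locale.ROOT).split("\\.")) {
            if (label.startsWith("xn--")) {
                return true;
            }
        }
        return false;
    }

    // Returns the matcher result in the same form as the input:
    // Punycode input gives a Punycode result, anything else a Unicode result.
    public static String inInputForm(final String input, final String result) {
        if (result == null) {
            return null;
        }
        return hasPunycodeLabel(input) ? IDN.toASCII(result) : IDN.toUnicode(result);
    }

    public static void main(final String[] args) {
        // Stand-in for a matcher that internally produced a Unicode domain root.
        final String unicodeRoot = IDN.toUnicode("xn--85x722f.xn--fiqs8s");
        // Prints: xn--85x722f.xn--fiqs8s
        System.out.println(inInputForm("www.xn--85x722f.xn--fiqs8s", unicodeRoot));
    }
}
```

   With something along these lines, the removed assertion 
`checkPublicSuffix("www.xn--85x722f.xn--fiqs8s", "xn--85x722f.xn--fiqs8s")` 
would hold again, which is the behaviour the upstream tests describe.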



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

