[
https://issues.apache.org/jira/browse/HTTPCLIENT-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17882100#comment-17882100
]
Julian Reschke edited comment on HTTPCLIENT-2341 at 9/16/24 4:08 PM:
---------------------------------------------------------------------
It is *not* a special component inside the path component of a hierarchical
URI (such as in the "http" and "https" schemes).
Consumers that treat "@" differently from "%40" can do that; it just means that
they treat HTTP(s) URIs that are equivalent based on "Scheme-Based
Normalization" as non-equivalent. Depending on the use case that can be by
design, or it could be a very bad idea (for instance, when access control
layers have a different opinion about what is equivalent than other parts of
the stack).
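
To illustrate the two consumer behaviours described above, here is a minimal
sketch using only java.net.URI. The class and method names are made up for
this example; it assumes a strict consumer compares the raw (still encoded)
path, while a lenient one compares the percent-decoded path.

```java
import java.net.URI;

public class PathEquivalence {
    // Strict view: URI.equals compares the raw path, so "%40" and "@" differ.
    static boolean sameRaw(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }

    // Lenient view: getPath() returns the percent-decoded path, so both
    // spellings collapse to the same string.
    static boolean sameDecodedPath(String a, String b) {
        return URI.create(a).getPath().equals(URI.create(b).getPath());
    }

    public static void main(String[] args) {
        String encoded = "http://example.com/foo%40bar";
        String literal = "http://example.com/foo@bar";
        System.out.println("raw equal:     " + sameRaw(encoded, literal));
        System.out.println("decoded equal: " + sameDecodedPath(encoded, literal));
    }
}
```

If an access control layer used the first comparison and the backend used the
second, the two would disagree about which resource a request refers to.
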
> DefaultRedirectStrategy breaks reserved chars in URI path
> ---------------------------------------------------------
>
> Key: HTTPCLIENT-2341
> URL: https://issues.apache.org/jira/browse/HTTPCLIENT-2341
> Project: HttpComponents HttpClient
> Issue Type: Bug
> Components: HttpClient (classic)
> Affects Versions: 4.5.14
> Environment: httpclient4 (4.5.14)
> Linux/Ubuntu 22.04
> Reporter: Xavier BOURGOUIN
> Priority: Major
> Attachments: hc4normalize.tar.gz
>
>
> When an HTTP response has a URI in the Location header with percent-encoded
> reserved chars (such as %40), these chars are replaced by their normalized
> equivalent (which is "@" in the case of %40), which seems to contradict RFC
> 3986 ([https://www.rfc-editor.org/rfc/rfc3986#section-2.2]), at least in the
> sense that for such reserved characters, the percent-encoded form doesn't
> have the same semantic meaning as the literal character, and thus the two
> aren't to be interpreted as equivalent.
> One of the impacts is that it breaks any server / API that redirects clients
> to an S3 blob object (AWS S3 for instance) that happens to contain a %40
> in the URI path (ex: location: https://<endpoint>/<some blob
> container>/foo%40bar.file)
> Disabling URI normalization as shown below seems to work around it:
> {code:java}
> HttpGet httpget = new HttpGet("http://service-that-redirects");
> httpget.setConfig(RequestConfig.custom().setNormalizeUri(false).build());
> {code}
> However, I'm not sure that's satisfactory if, as we suspect above, it is
> always wrong to "normalize" those reserved characters (plus normalization is
> enabled by default).
> Note that httpclient5 is fine (the percent-encoded %40 is preserved as it
> should be, and it seems there's no longer a toggle for the normalization
> behavior anyway).
> Comparing httpclient 4.x vs 5.x, it seems the URI normalization utility isn't
> the same, which might explain why httpclient5 has no issue:
> [https://github.com/apache/httpcomponents-client/blob/4.5.x/httpclient/src/main/java/org/apache/http/impl/client/DefaultRedirectStrategy.java#L163]
> [https://github.com/apache/httpcomponents-client/blob/5.3.x/httpclient5/src/main/java/org/apache/hc/client5/http/impl/DefaultRedirectStrategy.java#L116]
> (org.apache.http.client.utils.URIUtils.normalize() for HC4, versus
> java.net.URI.normalize for HC5)
>
> This past ticket https://issues.apache.org/jira/browse/HTTPCLIENT-2271 was
> discussing something very similar, except it was the other way around: some
> reserved characters were replaced by their percent-encoded equivalent.
> However, in the lengthy comment thread there, it seems a consensus was
> finally reached that for such chars, the percent-encoded value isn't
> equivalent to the original value and thus shouldn't be transformed. So I
> believe that reasoning should hold in both directions, and should also apply
> to the case reported here.
> I worked out a reproducer in the form of a little maven project that I'm
> attaching to this ticket, inspired by the one from that other ticket, that
> demonstrates the issue for httpclient 4.5.14 (but probably all 4.x behaves
> the same), and compares it with httpclient5 (5.3.1). It should run directly
> with _mvn exec:java_ and hopefully the output and code content are clear
> enough to be self-explanatory.
>
> In essence what it does is:
> * Start a dummy http server with two services: */foo*, which redirects to
> */foo%40bar*, and one that listens on */foo@bar* and replies with HTTP 200.
> * Test httpclient4 (along with some other clients to demonstrate the
> differences in behavior) by sending a GET request to */foo* and
> observing if and how it follows the redirect toward {*}/foo@bar{*}, which
> shows whether *%40* was replaced by *@*
>
> {code:java}
> import com.sun.net.httpserver.HttpExchange;
> import com.sun.net.httpserver.HttpHandler;
> import com.sun.net.httpserver.HttpServer;
> import java.io.IOException;
> import java.io.OutputStream;
> import java.net.InetSocketAddress;
>
> // Dummy server
> public static void main(String[] args) throws IOException,
>         InterruptedException {
>     HttpServer server = HttpServer.create(new InetSocketAddress(8000), 0);
>     server.createContext("/foo", new RedirectHttpHandler());
>     server.createContext("/foo@bar", new SuccessHttpHandler());
>     server.setExecutor(null); // use the default executor
>     server.start();
>
>     // [... test client requests]
>
>     // Stop the server only after the client tests have run
>     server.stop(0);
> }
>
> // Answers every request with a 302 pointing at the percent-encoded path
> public static class RedirectHttpHandler implements HttpHandler {
>     @Override
>     public void handle(HttpExchange t) throws IOException {
>         t.getResponseHeaders().add("Location", "/foo%40bar");
>         t.sendResponseHeaders(302, 0);
>         OutputStream os = t.getResponseBody();
>         os.close();
>     }
> }
>
> // Logs the request URI as received and answers with HTTP 200
> public static class SuccessHttpHandler implements HttpHandler {
>     @Override
>     public void handle(HttpExchange t) throws IOException {
>         System.out.println("[server] Received GET with URI: " +
>                 t.getRequestURI().toString());
>         byte[] body = "You followed the redirect!".getBytes();
>         t.sendResponseHeaders(200, body.length);
>         OutputStream os = t.getResponseBody();
>         os.write(body);
>         os.close();
>     }
> }
> {code}
> And the httpclient4 test looks like this:
> {code:java}
> CloseableHttpClient client = HttpClients.createDefault();
> HttpGet httpget = new HttpGet("http://127.0.0.1:8000/foo");
> CloseableHttpResponse response = client.execute(httpget);
> if (response.getStatusLine().getStatusCode() == 302) {
>     System.out.println("-> Location header: " +
>             response.getFirstHeader("Location").getValue());
> } else if (response.getStatusLine().getStatusCode() == 200) {
>     System.out.println("-> Followed the redirect!");
> } else {
>     throw new RuntimeException("Unexpected response code: " +
>             response.getStatusLine().getStatusCode());
> }
> {code}
>
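
The quoted report attributes the HC5 behaviour to java.net.URI.normalize. As a
rough illustration of why that would leave %40 alone, the sketch below (class
and method names are invented for this example) shows that
java.net.URI.normalize() only removes "." and ".." path segments and does not
touch percent-encoded octets. It does not reproduce what HC4's
URIUtils.normalize() does.

```java
import java.net.URI;

public class DotSegmentNormalize {
    // Applies java.net.URI.normalize(), which only collapses "." and ".."
    // segments in the path; escaped octets such as %40 are left untouched.
    static String normalize(String uri) {
        return URI.create(uri).normalize().toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("http://example.com/a/../foo%40bar"));
    }
}
```

With this input the "a/.." segments are removed while "%40" stays encoded in
the result.
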
--
This message was sent by Atlassian Jira
(v8.20.10#820010)