[ 
https://issues.apache.org/jira/browse/HTTPCLIENT-1995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16871179#comment-16871179
 ] 

Hartmut Arlt edited comment on HTTPCLIENT-1995 at 6/24/19 12:45 PM:
--------------------------------------------------------------------

[~olegk] - See Page 11 and ff. Again, Java (or Oracle) is not to blame as this 
non-equivalent normalization is yours. 
 reserved = gen-delims / sub-delims

gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"

sub-delims = "!" / "$" / "&" / "'" / "(" / ")"
 / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting
 characters that are distinguishable from other data within a URI.
 URIs that differ in the replacement of a reserved character with its
 corresponding percent-encoded octet are not equivalent. *Percent-*
 *encoding a reserved character, or decoding a percent-encoded octet*
 *that corresponds to a reserved character, will change how the URI is*
 *interpreted by most application*s. Thus, characters in the reserved
 set are protected from normalization and are therefore safe to be
 used by scheme-specific and producer-specific algorithms for
 delimiting data subcomponents within a URI.


was (Author: harlt):
[~olegk] - See Page 11 and ff. Again, Java (or Oracle) is not to blame as this 
non-equivalent normalization is yours. 
       reserved    = gen-delims / sub-delims

      gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

      sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
                  / "*" / "+" / "," / ";" / "="

   The purpose of reserved characters is to provide a set of delimiting
   characters that are distinguishable from other data within a URI.
   URIs that differ in the replacement of a reserved character with its
   corresponding percent-encoded octet are not equivalent.  Percent-
   encoding a reserved character, or decoding a percent-encoded octet
   that corresponds to a reserved character, will change how the URI is
   interpreted by most applications.  Thus, characters in the reserved
   set are protected from normalization and are therefore safe to be
   used by scheme-specific and producer-specific algorithms for
   delimiting data subcomponents within a URI.

> Percent-encoded ampersand in URI path not preserved
> ---------------------------------------------------
>
>                 Key: HTTPCLIENT-1995
>                 URL: https://issues.apache.org/jira/browse/HTTPCLIENT-1995
>             Project: HttpComponents HttpClient
>          Issue Type: Bug
>          Components: HttpClient (classic)
>    Affects Versions: 4.5.8, 4.5.9
>         Environment: Linux Mint 19, OpenJDK 8
>            Reporter: none_
>            Priority: Major
>
> Starting with HttpClient 4.5.8, percent-encoded ampersand characters in URI 
> path segments are not preserved any longer but written in decoded form to 
> wire due to path normalization performed by URIUtils.rewriteURI(URI, 
> HttpHost).
>  
> According to RFC 3986 (page 11+), the ampersand character is a delimiter and 
> thus needs to be percent-encoded when not used for this purpose. Path 
> normalization, as performed by HttpClient v4.5.8+, creates a new URI that is 
> not equivalent to the original URI and thus leads to misinterpretation on 
> server/receiver side.
> ??URIs that differ in the replacement of a reserved character with its??
> ??corresponding percent-encoded octet are not equivalent. Percent-??
> ??encoding a reserved character, or decoding a percent-encoded octet??
> ??that corresponds to a reserved character, will change how the URI is??
> ??interpreted by most applications??.
>   
> A very simple test case is as follows:
> {code:java}
> @Test
> public void testAmpersand() throws Throwable
> {
>     final URI uri = new 
> URI("http://example.org/some/path%26with%20percent/encoded/segments";);
>     final URI uri2 = URIUtils.rewriteURI(uri, null);
>         
>     Assert.assertEquals(uri, uri2);
> }
> {code}
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org
For additional commands, e-mail: dev-h...@hc.apache.org

Reply via email to