[ 
https://issues.apache.org/jira/browse/HTTPCORE-739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17708998#comment-17708998
 ] 

Julian Reschke commented on HTTPCORE-739:
-----------------------------------------

The URI spec does not define the structure of the query part. Thus, an API that 
parses query parameters needs to look elsewhere. In this case, that 
(unfortunately) is the WHATWG URL spec 
(https://url.spec.whatwg.org/#application/x-www-form-urlencoded), which does 
indeed define the handling of "+".
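
To illustrate the difference, here is a minimal sketch using only the JDK. It 
is not the URIBuilder code; the class name PlusDecodingSketch and the helper 
decode(...) are made up for this example, and only the flag name plusAsBlank is 
borrowed from the issue description. It percent-decodes a component as UTF-8 
and turns "+" into a space only when asked to, which is roughly what the 
form-urlencoded rules imply for query parameters as opposed to path segments.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of HttpCore.
final class PlusDecodingSketch {

    // Percent-decodes s as UTF-8. If plusAsBlank is true, '+' is decoded to a
    // space (form-urlencoded behaviour); otherwise '+' is kept as is.
    public static String decode(String s, boolean plusAsBlank) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '%' && i + 2 < s.length()
                    && Character.digit(s.charAt(i + 1), 16) >= 0
                    && Character.digit(s.charAt(i + 2), 16) >= 0) {
                // Valid %XX escape: emit the encoded byte.
                int hi = Character.digit(s.charAt(i + 1), 16);
                int lo = Character.digit(s.charAt(i + 2), 16);
                out.write((hi << 4) | lo);
                i += 2;
            } else if (c == '+' && plusAsBlank) {
                out.write(' ');
            } else {
                // Copy any other character (including a stray '%') unchanged.
                int cp = s.codePointAt(i);
                byte[] bytes = new String(Character.toChars(cp))
                        .getBytes(StandardCharsets.UTF_8);
                out.write(bytes, 0, bytes.length);
                i += Character.charCount(cp) - 1;
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String raw = "plus-+_enc-space-%20_enc-plus-%2B_";
        // Path-style decoding: '+' stays a plus sign.
        System.out.println(decode(raw, false));
        // Query-style (form-urlencoded) decoding: '+' becomes a space.
        System.out.println(decode(raw, true));
    }
}
```

With the sample string from the issue's test case, the first call keeps the 
literal plus sign and the second one yields a space in its place, while %2B 
decodes to "+" in both cases.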



> org.apache.hc.core5.net.URIBuilder does not decode plus characters (`+`) in 
> the query part
> ------------------------------------------------------------------------------------------
>
>                 Key: HTTPCORE-739
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-739
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.2.1
>            Reporter: Andreas Loth
>            Priority: Major
>
> Currently, when decoding the query part of a URL, a plus sign is kept as a 
> plus sign in the decoded name-value pairs.
> Expected would be that a plus sign is decoded to a space.
> https://www.w3.org/Addressing/URL/uri-spec.html
> > Within the query string, the plus sign is reserved as shorthand notation 
> > for a space. Therefore, real plus signs must be encoded.
> I'm perfectly fine with encoding space everywhere to %20 and the plus sign 
> everywhere to %2B (in my experience this is the least ambiguous and least 
> error-prone way to handle these characters). See HTTPCORE-628.
> However, during decoding the position of the plus sign has to be respected: 
> decode it to space in the query part but leave it as plus everywhere else.
> Test case for decoding:
> {noformat}
>  * URL: 
> https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test
>  * path: /abc/plus-+_enc-space- _enc-plus-+_/def
>  * get argument 1 name: test
>  * get argument 1 value: plus- _enc-space- _enc-plus-+_
>  * get argument 2 name: plus- _enc-space- _enc-plus-+_
>  * get argument 2 value: test
> {noformat}
> Test case for encoding:
> {noformat}
>  * path: /abc/plus-+_space- _/def
>  * get argument 1 name: test
>  * get argument 1 value: plus-+_space- _
>  * get argument 2 name: plus-+_space- _
>  * get argument 2 value: test
>  * URL: 
> https://example.org/abc/plus-%2B_space-%20_/def?test=plus-%2B_space-%20_&plus-%2B_space-%20_=test
> {noformat}
> Potential fix (untested):
> https://github.com/apache/httpcomponents-core/blob/86ccd9b58ecc39ac5496af012a5decb33203ea1e/httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java#L410
> Change the value of the `plusAsBlank` argument from `false` to `true`.


