[ https://issues.apache.org/jira/browse/HTTPCORE-739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Loth updated HTTPCORE-739:
----------------------------------
    Description:
Currently, when decoding the query part of a URL, a plus sign is kept as a plus sign in the decoded name-value pairs.
The expected behavior is that a plus sign is decoded to a space.

https://www.w3.org/Addressing/URL/uri-spec.html
> Within the query string, the plus sign is reserved as shorthand notation for
> a space. Therefore, real plus signs must be encoded.

I'm perfectly fine with encoding space everywhere to %20 and the plus sign everywhere to %2B (in my experience this is the most unambiguous and least error-prone way to handle these characters). See HTTPCORE-628.

However, during decoding the position of the plus sign has to be respected: decode it to a space in the query part but leave it as a plus everywhere else.

Test case for decoding:
{noformat}
* URL: https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test
* path: /abc/plus-+_enc-space- _enc-plus-+_/def
* get argument 1 name: test
* get argument 1 value: plus- _enc-space- _enc-plus-+_
* get argument 2 name: plus- _enc-space- _enc-plus-+_
* get argument 2 value: test
{noformat}

Test case for encoding:
{noformat}
* path: /abc/plus-+_space- _/def
* get argument 1 name: test
* get argument 1 value: plus-+_space- _
* get argument 2 name: plus-+_space- _
* get argument 2 value: test
* URL: https://example.org/abc/plus-%2B_space-%20_/def?test=plus-%2B_space-%20_&plus-%2B_space-%20_=test
{noformat}

Potential fix (untested):
https://github.com/apache/httpcomponents-core/blob/86ccd9b58ecc39ac5496af012a5decb33203ea1e/httpcore5/src/main/java/org/apache/hc/core5/net/URIBuilder.java#L410
Change the value of the `plusAsBlank` argument from `false` to `true`.
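
The decoding test case above can be sketched with the JDK alone. This is only an illustration of the position-dependent rule, not HttpCore's implementation: the class `PlusDecodingSketch` and its methods `percentDecode` and `parseQuery` are hypothetical names, and the `plusAsBlank` parameter merely borrows the name of the argument mentioned in the potential fix. UTF-8 is assumed for percent-encoded bytes.

```java
import java.io.ByteArrayOutputStream;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

final class PlusDecodingSketch {

    // Hypothetical helper, not an HttpCore API: percent-decodes `s` as UTF-8.
    // plusAsBlank = true  -> a literal '+' becomes a space (query part)
    // plusAsBlank = false -> a literal '+' is kept (path and everything else)
    public static String percentDecode(String s, boolean plusAsBlank) {
        byte[] in = s.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream(in.length);
        for (int i = 0; i < in.length; i++) {
            byte b = in[i];
            if (b == '%' && i + 2 < in.length
                    && hex(in[i + 1]) >= 0 && hex(in[i + 2]) >= 0) {
                // Valid %XX escape: emit the decoded byte and skip both digits.
                out.write((hex(in[i + 1]) << 4) | hex(in[i + 2]));
                i += 2;
            } else if (b == '+' && plusAsBlank) {
                out.write(' ');
            } else {
                // Everything else, including a malformed '%', is copied verbatim.
                out.write(b);
            }
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Returns the value of an ASCII hex digit, or -1 if `b` is not one.
    private static int hex(byte b) {
        if (b >= '0' && b <= '9') return b - '0';
        if (b >= 'a' && b <= 'f') return b - 'a' + 10;
        if (b >= 'A' && b <= 'F') return b - 'A' + 10;
        return -1;
    }

    // Splits a raw query on '&' and '=' and decodes names and values with
    // plusAsBlank = true. Simplification: a repeated name overwrites the
    // earlier value because a Map is used.
    public static Map<String, String> parseQuery(String rawQuery) {
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            params.put(percentDecode(name, true), percentDecode(value, true));
        }
        return params;
    }

    public static void main(String[] args) {
        URI uri = URI.create("https://example.org/abc/plus-+_enc-space-%20_enc-plus-%2B_/def"
                + "?test=plus-+_enc-space-%20_enc-plus-%2B_&plus-+_enc-space-%20_enc-plus-%2B_=test");
        // Path: '+' stays a plus, %20 and %2B are decoded.
        System.out.println("path: " + percentDecode(uri.getRawPath(), false));
        // Query: '+' becomes a space, %20 and %2B are decoded.
        parseQuery(uri.getRawQuery()).forEach((name, value) ->
                System.out.println("name: " + name + ", value: " + value));
    }
}
```

Run against the URL from the decoding test case, the sketch yields the path and the two name-value pairs listed there. The point of the design is that the raw path and raw query are separated first and only then decoded, so the same `+` character can be treated differently depending on where it occurs.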
> org.apache.hc.core5.net.URIBuilder does not decode plus characters (`+`) in the query part
> ------------------------------------------------------------------------------------------
>
>                 Key: HTTPCORE-739
>                 URL: https://issues.apache.org/jira/browse/HTTPCORE-739
>             Project: HttpComponents HttpCore
>          Issue Type: Bug
>          Components: HttpCore
>    Affects Versions: 5.2.1
>            Reporter: Andreas Loth
>            Priority: Major