arturobernalg commented on code in PR #517: URL: https://github.com/apache/httpcomponents-core/pull/517#discussion_r1996253015
########## httpcore5/src/main/java/org/apache/hc/core5/net/PercentCodec.java: ########## @@ -113,6 +113,37 @@ public class PercentCodec { RFC5987_UNRESERVED.set('~'); } + static final BitSet PCHAR = new BitSet(256); + static final BitSet USERINFO = new BitSet(256); + static final BitSet REG_NAME = new BitSet(256); + static final BitSet PATH_SEGMENT = new BitSet(256); + static final BitSet QUERY = new BitSet(256); + static final BitSet FRAGMENT = new BitSet(256); + + static { + PCHAR.or(UNRESERVED); + PCHAR.or(SUB_DELIMS); + PCHAR.set(':'); + PCHAR.set('@'); + USERINFO.or(UNRESERVED); + USERINFO.or(SUB_DELIMS); + USERINFO.set(':'); + REG_NAME.or(UNRESERVED); + REG_NAME.or(SUB_DELIMS); + REG_NAME.clear('!'); + PATH_SEGMENT.or(PCHAR); + QUERY.or(PCHAR); + QUERY.set('/'); + QUERY.set('?'); + FRAGMENT.or(PCHAR); + FRAGMENT.set('/'); + FRAGMENT.set('?'); + // Some sub-delims remain encoded (RFC 3986 allows them unencoded, but we choose to be strict). + PATH_SEGMENT.clear('('); + PATH_SEGMENT.clear(')'); + PATH_SEGMENT.clear('&'); + } Review Comment: > @arturobernalg I cannot find anything in the RFC stating that. What am I missing? What a super strict mode we have `ALL_RESERVED` @olegk The test expects `(`, `)`, `&` to remain percent-encoded in path segments (e.g., `blah-%28%20-blah-%20%26%20-blah-%20%29-blah/`), but without the `PATH_SEGMENT.clear('(')`, `clear(')')`, `clear('&')` in `PercentCodec.java`, it outputs `blah-(%20-blah-%20&%20-blah-%20)-blah/`. Reintroducing the `clear()` calls fixes it, but I’m wondering if this aligns with our intended behavior or if we should adjust `formatPath` to handle encoding differently based on `encodingPolicy`. I noticed the older `formatPath` used `PercentCodec.encode(buf, segment, charset, false)` with `UNRESERVED`, which worked, but the new version with `encodingPolicy` support might need tweaking. Could you take a look and advise? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@hc.apache.org For additional commands, e-mail: dev-h...@hc.apache.org