Hi Tim & all,

> On Mar 21, 2025, at 06:22, Tim Düsterhus <t...@bastelstu.be> wrote:
> 
> Am 2025-03-18 18:48, schrieb Paul M. Jones:
>> $iriPath = '/heads/' . rawurlencode($val) . '/tails/';
>> assert($iriPath === '/heads/fü bar/tails/'); // false
> 
> From my reading of RFC 3987 that result is incorrect. The space is neither 
> listed as `iunreserved`, nor as `sub-delims`, thus isn't a valid `ipchar`. 
> Thus the space needs to be encoded as %20 for IRIs as well. The same mistake 
> applies to the reference userland implementation below.

Agreed; the naive implementation would need to be less naive and pay closer 
attention to the ABNF for `ucschar` and `ipchar` in the spec.
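
As a rough illustration of what "less naive" might look like, here is a minimal userland sketch. The function name `iri_encode_path_segment()` is hypothetical (it is not an existing or proposed PHP function), and the sketch assumes a deliberately conservative policy: leave only the RFC 3986 unreserved characters and the RFC 3987 `ucschar` ranges untouched, and percent-encode everything else, including `sub-delims`, `:`, `@`, and `%`, even though `ipchar` would permit some of those.

```php
<?php
declare(strict_types=1);

// Hypothetical helper for illustration only.
function iri_encode_path_segment(string $segment): string
{
    // RFC 3987 `ucschar` ranges in the Basic Multilingual Plane.
    $ucschar = '\x{A0}-\x{D7FF}\x{F900}-\x{FDCF}\x{FDF0}-\x{FFEF}';

    // Planes 1 through 13: %x10000-1FFFD ... %xD0000-DFFFD.
    for ($plane = 1; $plane <= 13; $plane++) {
        $ucschar .= sprintf('\x{%X0000}-\x{%XFFFD}', $plane, $plane);
    }

    // Plane 14 starts at E1000 in the ABNF.
    $ucschar .= '\x{E1000}-\x{EFFFD}';

    // Match every character that is neither unreserved nor `ucschar`.
    $pattern = '/[^A-Za-z0-9\-._~' . $ucschar . ']/u';

    $encoded = preg_replace_callback(
        $pattern,
        static fn (array $m): string => rawurlencode($m[0]),
        $segment
    );

    // preg_replace_callback() returns null on invalid UTF-8 input;
    // in that case fall back to plain byte-wise encoding.
    return $encoded ?? rawurlencode($segment);
}
```

With this sketch, `iri_encode_path_segment('fü bar')` keeps the `ü` as-is and encodes the space as `%20`, which is the behavior Tim describes.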

Along those lines, I think there might need to be two additional 
changes/additions to help with encoding for RFC 3987 and WHATWG-URL component 
values:

- `http_build_query()` would need PHP_QUERY_3987 and PHP_QUERY_WHATWG flags and 
corresponding logic (or entirely new functions); and
- `parse_str()` would need a corresponding `mb_parse_str()`.


-- pmj
