Hi
On 2025-04-17 13:18, Máté Kocsis wrote:
>> Sweet. I believe this was/is the last remaining blocker for the RFC, or
>> is there still anything else from your side that needs to be discussed?
>> I need to give the RFC another read once you made the adjustment to
>> remove the WhatWg raw methods (and adjusted the corresponding
>> explanations), but I think I'm happy then :-)
>
> No, I also think that was the last one, as I don't have any questions
> left. Although, we should finalize what the WHATWG getters should be
> named? I like the explicit "raw" that you suggested, but I can also see
> that it may be confusing for some people. Altogether I think I prefer
> adding "raw" so that it's clear that they behave similarly to how the
> raw RFC 3986 getters do.
In https://news-web.php.net/php.internals/127114 I suggested providing
only the "non-raw" methods, so I believe you misread that. I've just
given the RFC another read and thought about the naming, and I still
prefer not having "raw" in the name:
- Having the `raw` in the name makes the API very clunky / verbose to
use.
- Other implementations, such as in browsers or node.js, also simply use
the component name without any indication of the output being raw.
- Future changes to the WHATWG URL specification might introduce some
normalization for components that currently aren't normalized.
This would make the `raw` naming a misnomer and might require new
methods / deprecations on PHP's end.
So it seems to be safer to use the naming without the `raw` and then in
the documentation explain what happens with useful examples, just like
the RFC already does.
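To illustrate the second point, here is a minimal sketch of how the WHATWG URL getters look in browsers and Node.js. The input URL and the `readComponents` helper are made up for illustration and are not taken from the RFC. The getters use the plain component names and return percent-encoded sequences as they are, without decoding them:

```typescript
// Sketch only: the standard WHATWG URL class available in browsers and Node.js.
// readComponents is a hypothetical helper name used for this illustration.
function readComponents(input: string): { pathname: string; search: string; hash: string } {
  const url = new URL(input);
  // Plain component names, no "raw" prefix, and no percent-decoding on read.
  return { pathname: url.pathname, search: url.search, hash: url.hash };
}

const parts = readComponents("https://example.com/foo%2Fbar?key=a%26b#frag%20ment");
console.log(parts.pathname); // "/foo%2Fbar"
console.log(parts.search);   // "?key=a%26b"
console.log(parts.hash);     // "#frag%20ment"
```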
------------
Other than that, I noticed the following small issues:
1. The `UrlValidationError` class is `final` in the implementation, but
not in the RFC text. I assume that is an oversight.
2. In the "another tricky example" of the "Advanced examples" section,
there is a duplicate `?foo=bar%26baz%3Dqux` in the query string. I
assume that is unintentional and not part of the example.
3. Also in the "another tricky example" of the "Advanced examples"
section: I think it would be useful to have an explicit comparison to
the output of the WHATWG URL, especially around the IPv6 normalization.
I've seen that this is also mentioned later, but it's probably useful to
have here as well.
4. In the "Component modification" section, in the example for "In order
to offer consistent behavior with the parsing rules of RFC 3986, withers
of Uri\Rfc3986\Uri also only accept properly formatted input", there is
an
`echo $uri->getRawHost(); // [2001:0db8:0001:0000:0000:0ab9:C0A8:0102]`
call, but the host is never modified. That appears to be an error.
5. In the "Serialization" section: The explanation of the serialization
format is overly specific regarding the implementation details. I would
simplify that to just say "it supports serialization by using the
toRawString() output and performs strict checks during unserialization"
or similar. The reason is that I want to make some suggestions to the
serialization format to provide greater flexibility for future changes
during the technical review of the implementation :-)
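Regarding the IPv6 comparison asked for in item 3, here is a rough sketch of what a WHATWG implementation (browser or Node.js) does with the host literal from item 4. The scheme and path are made up, and `whatwgHost` is a hypothetical helper name. It only shows the WHATWG side; the PHP output would need to be taken from the RFC's own classes:

```typescript
// Sketch only: WHATWG host serialization as done by the standard URL class.
// whatwgHost is a hypothetical helper name used for this illustration.
function whatwgHost(input: string): string {
  return new URL(input).hostname;
}

// Leading zeros are dropped, hex digits are lowercased, and the longest run
// of zero groups is compressed to "::".
console.log(whatwgHost("https://[2001:0db8:0001:0000:0000:0ab9:C0A8:0102]/"));
// prints "[2001:db8:1::ab9:c0a8:102]"
```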
------------
I did not give the implementation another test, since with the removal
of the percent-decoding for WHATWG, the RFC just does what the other
specifications already require. So this all makes sense to me and any
differences would simply be a regular bug in the code, rather than the
RFC text.
Best regards
Tim Düsterhus