Hi Gina,

1.
> The paragraph in at the beginning of the RFC in the > Relevant URI
> specifications > WHATWG URL section seems to be incomplete.
>

Hopefully it's good now. Although I know this section doesn't include much
information.

2.
> I don't really understand how the UninitializedUriException exception can
> be thrown?
> Is it somehow possible to create an instance of a URI without initializing
> it?
> This seems unwise in general.
>

I think I've already answered this since then in my previous email (and in
the RFC as well), but yes, it's possible via reflection.
I don't really have an idea how this possibility could be avoided without
also making the classes final.


>
> 3.
> I'm not really convinced by using the constructor to be able to create a
> URI object.
> I think it would be better for it to be private/throwing and have two
> static constructor `parse` and `tryParse`,
> mimicking the API that exists for creating an instance of a backed enum
> from a scalar.
>

I'm not completely against using parse() and tryParse(), but I think the
constructor already makes it clear that it either returns
a valid object or throws.

>
> 4.
> I think changing the name of the `toString` method to `toRawString` better
> matches the rest of the proposed API,
> and also removes the question as to why it isn't the magic method
> `__toString`.
>

For RFC 3986, we could go with toString() instead of toNormalizedString()
and toRawString() instead of toString() so that we use
the same convention as for getters.

Recently I learnt that for some reason WHATWG normalizes the IP address
*during* component recomposition, so its toString() is
not really the most rare (at least not in the same way as "raw getters"
are). So for WHATWG, I think keeping toString() and
toDisplayString() probably still makes sense.


>
> 5.
> I will echo Tim's concerns about the non-final-ity of the URI classes.
> This seems like a recipe for disaster.
> I can _maybe_ see the usefulness of extending Rfc3986\Uri by a subclass
> Ldap\Uri,
> but being able to extend the WhatWg URI makes absolutely no sense.
> The point of these classes is that if you have an instance of one of
> these, you *know* that you have a valid URI.
> Being able to subclass a URI and mess with the `equals`, `toString`,
> `toNormalizedString` methods throws away all the safety guarantees provided
> by ***possessing*** a Uri instance.
>

I'm sure that people will find their use-cases to subclass all these new
classes, including the WHATWG implementation. As Nicolas mentioned,
his main use-case is minly adding convenience and new factory methods that
don't specifically need all methods to be reimplemented.

While I share your opinion that leaving the URI classes open for extension
is somewhat risky and it's difficult to assess its impacts right now, I can
also
sympathise with what Nicolas wrote in a later message (
https://externals.io/message/123997#126489): that we shouldn't close the
door for the public from
using interchangeable implementations.

I know that going final without any interfaces is the most "convenient" for
the PHP project itself, because the solution has much less BC surface to
maintain,
so we are relatively free and safe to make future changes. This is useful
for an API in its early days that is huge like this. Besides the interests
of the maintainers,
we should also take two important things into account:

- Heterogeneous use-cases: it's out of question that the current API won't
fit all use-cases, especially because we have already identified some
followup tasks
that should be implemented (see "Future Scope" section in the RFC).
- Interoperability: Since URI handling is a very widespread problem, many
people and libraries will start to use the new extension once
it's available. But because
of the above reason, many of them want to use their own abstraction, and
that's exactly why a common ground is needed: there's simply not a single
right possible
implementation - everyone has their own, given the complexity of the topic.

So we should try to be considerate about these factors by some way or
another. So far, we have four options:

- Making the classes open for extension: this solution has acknowledged
technical challenges (
https://github.com/php/php-src/pull/14461#discussion_r1847316607),
and it limits our possibilities of adding changes the most, but users can
effectively add any behavior that they need. Of course, they are free to
introduce bugs and
spec-incompatible behavior into their own implementation, but none of the
other solutions could prevent such bugs either, since people will write
their custom code
wherever they can: if they can't have it in a child class, then they will
have in MyUri, or in UriHelper, or just in a 200 lines long function.

Being able to extend the built-in classes also means that child classes can
use the behavior of their parent by default - there's no need to create
wrapper
classes around the built-in ones (aka using composition), that is a tedious
task to implement, and also which would incur some performance penalty
because of the
extra method calls.

- Making the classes open for extension, but making some methods final:
same benefits as above, without the said technical challenges - in theory.
I am currently
trying to figure out if there is a combination of methods that could be
made final so that the known challenges become impossible to be triggered -
although I haven't
managed to come up with a sensible solution yet.

- Making the classes final: It avoids some edge-cases for the built-in
classes (the uninitialized state most prominently), while it leaves the
most room for making future
changes. Projects that may want to ship their own abstractions for the two
built-in classes can use composition to create their own URI
implementations.
They can instantiate these implementations however they want to (i.e.
$myUri = new MyUri($uri)). If they need to pass an URI to other libraries
then they could extract
the wrapped built-in class (i.e. $myUri->getUri()).

On the flipside, backporting methods added in future PHP versions (aka
polyfills) will become impossible to implement for URIs according to my
knowledge, as well as mocking
in PHPUnit will also be a lost feature (I'm not sure if it's a good or a
bad thing, but it may be worth to point out).

Also, the current built-in implementations may have alternative
implementations that couldn't be used instead of them. For example, the ADA
URL library (which is mentioned
in the RFC) also implements the WHATWG specification - possibly the very
same way as Lexbor, the currently used library - does. These alternative
implementations may have
different performance characteristics, platform requirements, or level of
maintenance/support, which may qualify them as more suitable for some
use-cases than what the built-in
ones can offer. If we make these classes final, there's no way to use
alternative implementations as a replacement for the default ones, although
they all implement the same
specification having mostly clear semantics.

- Making the classes final, but adding a separate interface for each: The
impact of making the built-in classes final would be mitigated by adding
one interface
for each specification (I didn't like this idea in the past, but it now
looks much more useful in the perspective of the final vs non-final
debate). Because of the interfaces,
there would be a common denominator for the different possible
implementations. I'm sure that someone would suggest that the community
(aka PHP-FIG)
should come up with such an interface, but I think we shouldn't expect
someone else to do the work when *we* are in the position to do it the
best, as those interfaces
should be internal ones, since the built-in URI classes should also
implement them.

If we had these interfaces, projects could use whatever abstraction they
want via composition, but they could more conveniently pass around the same
object everywhere.

I intentionally don't try to draw a conclusion for now, first of all
because it already took me a lot of time to try to mostly objectively
compare the different possibilities, and
I hope that we can find more pros-cons (or fix my reasonings if I made
mistakes somewhere) in order to finally reach some kind of consensus.


> Similarly, I don't understand why the WhatWgError is not final.
> Even if subclassing of the Uri classes is allowed, any error it would have
> would not be a WhatWg one,
> so why should you be able to extend it.
>

I made it final now.

Thank you for your comments:
Máté

Reply via email to