Hi Gina, 1. > The paragraph in at the beginning of the RFC in the > Relevant URI > specifications > WHATWG URL section seems to be incomplete. >
Hopefully it's good now. Although I know this section doesn't include much information. 2. > I don't really understand how the UninitializedUriException exception can > be thrown? > Is it somehow possible to create an instance of a URI without initializing > it? > This seems unwise in general. > I think I've already answered this since then in my previous email (and in the RFC as well), but yes, it's possible via reflection. I don't really have an idea how this possibility could be avoided without also making the classes final. > > 3. > I'm not really convinced by using the constructor to be able to create a > URI object. > I think it would be better for it to be private/throwing and have two > static constructor `parse` and `tryParse`, > mimicking the API that exists for creating an instance of a backed enum > from a scalar. > I'm not completely against using parse() and tryParse(), but I think the constructor already makes it clear that it either returns a valid object or throws. > > 4. > I think changing the name of the `toString` method to `toRawString` better > matches the rest of the proposed API, > and also removes the question as to why it isn't the magic method > `__toString`. > For RFC 3986, we could go with toString() instead of toNormalizedString() and toRawString() instead of toString() so that we use the same convention as for getters. Recently I learnt that for some reason WHATWG normalizes the IP address *during* component recomposition, so its toString() is not really the most rare (at least not in the same way as "raw getters" are). So for WHATWG, I think keeping toString() and toDisplayString() probably still makes sense. > > 5. > I will echo Tim's concerns about the non-final-ity of the URI classes. > This seems like a recipe for disaster. > I can _maybe_ see the usefulness of extending Rfc3986\Uri by a subclass > Ldap\Uri, > but being able to extend the WhatWg URI makes absolutely no sense. > The point of these classes is that if you have an instance of one of > these, you *know* that you have a valid URI. > Being able to subclass a URI and mess with the `equals`, `toString`, > `toNormalizedString` methods throws away all the safety guarantees provided > by ***possessing*** a Uri instance. > I'm sure that people will find their use-cases to subclass all these new classes, including the WHATWG implementation. As Nicolas mentioned, his main use-case is minly adding convenience and new factory methods that don't specifically need all methods to be reimplemented. While I share your opinion that leaving the URI classes open for extension is somewhat risky and it's difficult to assess its impacts right now, I can also sympathise with what Nicolas wrote in a later message ( https://externals.io/message/123997#126489): that we shouldn't close the door for the public from using interchangeable implementations. I know that going final without any interfaces is the most "convenient" for the PHP project itself, because the solution has much less BC surface to maintain, so we are relatively free and safe to make future changes. This is useful for an API in its early days that is huge like this. Besides the interests of the maintainers, we should also take two important things into account: - Heterogeneous use-cases: it's out of question that the current API won't fit all use-cases, especially because we have already identified some followup tasks that should be implemented (see "Future Scope" section in the RFC). - Interoperability: Since URI handling is a very widespread problem, many people and libraries will start to use the new extension once it's available. But because of the above reason, many of them want to use their own abstraction, and that's exactly why a common ground is needed: there's simply not a single right possible implementation - everyone has their own, given the complexity of the topic. So we should try to be considerate about these factors by some way or another. So far, we have four options: - Making the classes open for extension: this solution has acknowledged technical challenges ( https://github.com/php/php-src/pull/14461#discussion_r1847316607), and it limits our possibilities of adding changes the most, but users can effectively add any behavior that they need. Of course, they are free to introduce bugs and spec-incompatible behavior into their own implementation, but none of the other solutions could prevent such bugs either, since people will write their custom code wherever they can: if they can't have it in a child class, then they will have in MyUri, or in UriHelper, or just in a 200 lines long function. Being able to extend the built-in classes also means that child classes can use the behavior of their parent by default - there's no need to create wrapper classes around the built-in ones (aka using composition), that is a tedious task to implement, and also which would incur some performance penalty because of the extra method calls. - Making the classes open for extension, but making some methods final: same benefits as above, without the said technical challenges - in theory. I am currently trying to figure out if there is a combination of methods that could be made final so that the known challenges become impossible to be triggered - although I haven't managed to come up with a sensible solution yet. - Making the classes final: It avoids some edge-cases for the built-in classes (the uninitialized state most prominently), while it leaves the most room for making future changes. Projects that may want to ship their own abstractions for the two built-in classes can use composition to create their own URI implementations. They can instantiate these implementations however they want to (i.e. $myUri = new MyUri($uri)). If they need to pass an URI to other libraries then they could extract the wrapped built-in class (i.e. $myUri->getUri()). On the flipside, backporting methods added in future PHP versions (aka polyfills) will become impossible to implement for URIs according to my knowledge, as well as mocking in PHPUnit will also be a lost feature (I'm not sure if it's a good or a bad thing, but it may be worth to point out). Also, the current built-in implementations may have alternative implementations that couldn't be used instead of them. For example, the ADA URL library (which is mentioned in the RFC) also implements the WHATWG specification - possibly the very same way as Lexbor, the currently used library - does. These alternative implementations may have different performance characteristics, platform requirements, or level of maintenance/support, which may qualify them as more suitable for some use-cases than what the built-in ones can offer. If we make these classes final, there's no way to use alternative implementations as a replacement for the default ones, although they all implement the same specification having mostly clear semantics. - Making the classes final, but adding a separate interface for each: The impact of making the built-in classes final would be mitigated by adding one interface for each specification (I didn't like this idea in the past, but it now looks much more useful in the perspective of the final vs non-final debate). Because of the interfaces, there would be a common denominator for the different possible implementations. I'm sure that someone would suggest that the community (aka PHP-FIG) should come up with such an interface, but I think we shouldn't expect someone else to do the work when *we* are in the position to do it the best, as those interfaces should be internal ones, since the built-in URI classes should also implement them. If we had these interfaces, projects could use whatever abstraction they want via composition, but they could more conveniently pass around the same object everywhere. I intentionally don't try to draw a conclusion for now, first of all because it already took me a lot of time to try to mostly objectively compare the different possibilities, and I hope that we can find more pros-cons (or fix my reasonings if I made mistakes somewhere) in order to finally reach some kind of consensus. > Similarly, I don't understand why the WhatWgError is not final. > Even if subclassing of the Uri classes is allowed, any error it would have > would not be a WhatWg one, > so why should you be able to extend it. > I made it final now. Thank you for your comments: Máté