Hi Máté,

> On Mar 18, 2025, at 15:15, Máté Kocsis <kocsismat...@gmail.com> wrote:
>
> There's no way I would have written an implementation from scratch. I'm using
> the url module of the Lexbor C library (https://github.com/lexbor/lexbor/)
> for handling WHATWG URLs. It's already bundled in core, and it's also battle
> tested, and it has exceptional maintenance.

I did not mean to imply writing a parser from scratch; my apologies for phrasing it poorly.

> All I had to implement is the glue between userland and the C library.

That is more what I was getting at. Rowbot has a lot of what looks to be good design work on structures that come out of the parsing, in addition to a separate parser class. The RFC might benefit from an explicit and intentional review of, and maybe incorporation of, some of the pre-existing Rowbot design work.

At least one thing from Rowbot is absolutely not applicable to the RFC (e.g. the PSR-3 logging); maybe none of the rest of it will be applicable either, but as prior art from someone acknowledged in the WHATWG-URL spec, I think it bears your close attention.

As an overview, the following is a brief comparison between Rowbot and the RFC; any missed or misrepresented functionality is unintentional.

* * *

## RFC

One non-final readonly Url class:

- 5 getRaw...() methods, 8 get...() methods, and one get...ForDisplay() method
- immutability via 8 with...() methods, broadly expecting properly-encoded arguments, and soft-erroring on invalid characters
- a static parse() method, with relative parsing capability and a place to capture errors
- equals() to compare two URLs
- toString() for machine-friendly string recomposition
- toDisplayString() for human-friendly string recomposition
- resolve() to resolve a relative URL using the current URL as the base
- serialize/deserialize; "the serialized form only includes the recomposed URI itself exposed as the `__uri` field, but the individual properties or URI components are not present."
- no URLSearchParams implementation

## Rowbot

(None of the classes are readonly or final; these look to hew closely to the WHATWG-URL spec.)

A BasicURLParser class:

- affords relative parsing capability and an optional parameter for the target URLRecord
- returns a URLRecord

A URLRecord class:

- public mutable properties for the URL components
- $scheme is a Scheme implementation with equals() and other is...() methods
- $host is a HostInterface (and implementations) with equals() and other is...() methods
- $path is a PathInterface (and PathList implementation) with PathSegment manipulation methods
- setUsername() and setPassword() mutators
- serializing
- getOrigin(), includesCredentials(), isEqual()

A URL class:

- Composed of a URLRecord and a URLSearchParams object
- Constructor takes a string, parses it to a URLRecord, and retains the URLRecord
- a static parse() method with relative parsing, as a convenience method
- __toString() and toString() return the serialized URLRecord
- Virtual properties for $href, $origin, $protocol, $username, $password, $host, $hostname, $port, $pathname, $search, $searchParams, $hash
- Mutability of virtual properties via magic __set()
- Readability of virtual properties via magic __get()

A URLSearchParams class:

- search params manipulation methods
- implements Countable, Iterator, Stringable
- composed of a QueryList implementation and (optionally) the originating URLRecord

* * *

--
pmj