On Mon, Jul 15, 2024, at 9:20 AM, Máté Kocsis wrote:
> Hey Ignace, Nicolas,
>
> Based on your request for adding support for RFC 3986 spec compatible
> parsing, I evaluated another library
> (https://github.com/uriparser/uriparser/) in the recent days in order
> to add support for the requested functionality. As far as I can tell,
> the results were very promising, so I'm ok to include this into my
> proposal (I haven't pushed my changes yet and haven't updated the RFC
> yet).
>
> Regarding the reference resolution
> (https://uriparser.github.io/doc/api/latest/#resolution) feature which
> has also already been asked for, I'm genuinely wondering what the
> use-case is? But in any case, I'm fine with incorporating this as well
> into the RFC, since apparently both Lexbor and uriparser support this
> (naturally).
>
> What I became puzzled about is the correct object structure and
> naming. Now that uriparser which can deal with URIs came into the
> picture, while Lexbor can parse URLs, I don't know if it's a good idea
> to have a dedicated URI and a URL class extending the former one... If
> it is, then in my opinion, the logical behavior would be that Lexbor
> always instantiates URL classes, while uriparser would have to decide
> if the passed-in URI is actually an URL, and choose the instantiated
> class based on this factor... But in this case the differences between
> the RFC 3986 and WHATWG specifications couldn't be spelled out, since
> URL objects could hold URLs parsed based on both specs (and therefore
> having a unified interface is required).
>
> Or rather we should have a separate URI and a WhatwgUrl class so that
> the former one would always be created by uriparser, while the latter
> one by Lexbor? This way we could have a dedicated object interface for
> both standards (e.g. the RFC 3986 related one could have a
> getUserInfo() method, while the WHATWG related one could have both
> getUser() and getPassword() methods). But then the question is how
> interchangeable these classes should be? I.e. should we be able to
> convert them back and forth, or should there be an interface that is
> implemented by the two classes?
>
> I'd appreciate any suggestions regarding these questions.
>
> P.S. due to its bad receptance, I got rid of the UrlParser class as
> well as the UrlComponent enum from my implementation in the meantime.
>
> Regards,
> Máté
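
To make the second option concrete, here is a rough sketch of the shape
being described: two separate classes, a small shared interface, and a
one-way conversion. It is written in Python purely for brevity and does
no parsing at all. The names Rfc3986Uri, UriLike and to_rfc3986() are
hypothetical and not taken from the RFC or the implementation; only
WhatwgUrl and the userinfo vs. user/password accessors come from the
message above.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


class UriLike(Protocol):
    """Hypothetical shared interface: only accessors both classes offer."""

    def get_scheme(self) -> Optional[str]: ...
    def get_host(self) -> Optional[str]: ...


@dataclass(frozen=True)
class Rfc3986Uri:
    """Hypothetical RFC 3986 value object (would be built by uriparser)."""

    scheme: Optional[str]
    user_info: Optional[str]  # a single userinfo component
    host: Optional[str]

    def get_scheme(self) -> Optional[str]:
        return self.scheme

    def get_host(self) -> Optional[str]:
        return self.host

    def get_user_info(self) -> Optional[str]:
        return self.user_info


@dataclass(frozen=True)
class WhatwgUrl:
    """WHATWG value object (would be built by Lexbor)."""

    scheme: str
    user: str      # separate username and password fields
    password: str
    host: Optional[str]

    def get_scheme(self) -> Optional[str]:
        return self.scheme

    def get_host(self) -> Optional[str]:
        return self.host

    def get_user(self) -> str:
        return self.user

    def get_password(self) -> str:
        return self.password

    def to_rfc3986(self) -> Rfc3986Uri:
        # Assumption: user and password are joined with ":" and an
        # empty pair maps to "no userinfo". A real conversion would also
        # have to deal with encoding differences between the two specs.
        if not self.user and not self.password:
            user_info = None
        elif self.password:
            user_info = f"{self.user}:{self.password}"
        else:
            user_info = self.user
        return Rfc3986Uri(self.scheme, user_info, self.host)
```

Code that only needs the scheme or host could accept UriLike, while
anything touching credentials would have to name the concrete class.
Whether the reverse conversion is even well defined is part of the open
question.
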
I apologize if I missed this up-thread somewhere, but what precisely
are the differences between URI and URL? My understanding was that URL
is a subset of URI (all URLs are URIs, but not all URIs are URLs).
You're saying they're slightly disjoint sets?

Can you give some concrete examples of where the parsing rules would
produce different results? That may give us a better sense of what the
logic should be.

--Larry Garfield