Hi Daniel,

Daniel Hartwig <mand...@gmail.com> writes:

> * Terminology
>
> The terminology used in the latest URI spec. (RFC 3986) is not widely
> used elsewhere.  Not by Guile, not by the HTTP spec., or other sources.
> Specifically, it defines these terms:
>
> - URI: scheme rest ... [fragment]
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-Ref: rest ... [fragment]
> - URI-Reference: Absolute-URI | Relative-Ref
>
> whereas HTTP and other sources use the terms from an earlier URI
> spec. (RFC 2396):
>
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-URI: rest ... [fragment]
> - URI, URI-Reference: Absolute-URI | Relative-URI
>
> With this patch I have opted to use the latter, more common terms.
> This has the advantage of not requiring massive renaming or
> duplicating of most procedures to include, e.g.
> ‘uri-reference-scheme’, as we just use the term ‘uri’ to refer to
> either type.
>
> If this is undesired, it can easily be reworked to use the terminology
> from RFC 3986.

Thanks for your careful work on this, and especially for calling our
attention to the terminology changes introduced in the latest URI spec.

My preference would be to use the newer RFC 3986 terms.  To my mind, the
key question is: which type (Absolute-URI or URI-Reference) is more
commonly appropriate in user code, and thus more deserving of the short
term "URI"?

I would argue that Absolute-URIs are more often appropriate in typical
user code.  The reason is that outside of URI-handling libraries, most
code that deals with URIs simply uses them as universal pointers, i.e.
it implicitly assumes that each URI is sufficient by itself to identify
any resource in the universe.

Working with URI-References is inherently trickier and more error-prone,
because code that handles them must do some additional bookkeeping to
associate each URI-Reference with its _context_.  It is inconvenient to
mix URI-References from different contexts, and they must be converted
when moved from one context to another.

For typical code, the simplest and safest strategy for dealing with
URI-References is to convert them to Absolute-URIs as early as possible,
preferably as the document is being read.  (Of course, there are special
cases such as editors where it is important to preserve the
URI-References, but that is not the typical case.)

Therefore, I think that Absolute-URI is more deserving of the short term
"URI", and furthermore that existing code outside of (web uri) that
refers to URIs should, by default, be assumed to be talking about
Absolute-URIs.  The type of a procedure should be changed to accept
URI-References only after some thought about whether it handles relative
references properly.

> * API compatibility
>
> Presently, all APIs work only with absolute URIs.  You cannot use
> string->uri or build-uri to produce any relative URIs, neither are
> other procedures (generally) expected to work correctly if given them.

For the reasons given above, I think that this is a virtue, not a flaw,
i.e.
I think that the latest URI spec (RFC 3986) got this right.  It is
important to clearly distinguish Absolute-URIs from URI-References.
Despite their overlapping syntax, they are very different concepts, and
must not be conflated.

Here's what I suggest: instead of extending 'string->uri' and
'build-uri' to produce relative URIs, name the extended procedures
'string->uri-reference' and 'build-uri-reference'.  These are long
names, but that's okay because users should think twice before using
them, and they are seldom needed.

Then, we extend 'string->uri' and 'build-uri' in a different way: we
extend them to handle relative references in their *inputs*, but
continue to provide absolute *outputs*, by adding an optional keyword
argument '#:base-uri'.  This would make it easy to implement the
simplest and safest strategy outlined above with a minimum of code
changes.

What do you think?

    Mark
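
P.S. To make the '#:base-uri' idea a little more concrete, here is a
rough sketch of the behaviour I have in mind.  It is written in Python
only because its standard library already provides reference resolution
('urljoin'); it is not Guile code.  The names 'string_to_uri' and
'base_uri' are hypothetical stand-ins for the proposed 'string->uri' and
'#:base-uri', and the decision to raise an error when no base is given
is my assumption about what a sensible default might be.

```python
from urllib.parse import urljoin, urlsplit


def string_to_uri(text, base_uri=None):
    """Return an absolute URI string for TEXT.

    Hypothetical analogue of the proposed (string->uri str #:base-uri base):
    relative references are accepted as input, but the result is always
    absolute.
    """
    if urlsplit(text).scheme:
        # TEXT already carries a scheme, so it stands on its own.
        return text
    if base_uri is None:
        # No context was supplied: refuse instead of leaking a
        # relative reference to the caller.
        raise ValueError(f"relative reference needs a base URI: {text!r}")
    if not urlsplit(base_uri).scheme:
        # The base itself must be absolute, or the result would not be.
        raise ValueError(f"base URI is not absolute: {base_uri!r}")
    # Resolve the reference against its context.
    return urljoin(base_uri, text)


if __name__ == "__main__":
    base = "http://example.org/docs/guide/index.html"
    print(string_to_uri("http://www.gnu.org/software/guile/"))
    # prints http://www.gnu.org/software/guile/
    print(string_to_uri("../img/logo.png", base_uri=base))
    # prints http://example.org/docs/img/logo.png
```

Code that reads a document would call this once per reference, passing
the document's own URI as the base, so that everything downstream only
ever sees absolute URIs.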