Re: [PATCH] Tweak web modules, support relative URIs
Hi Daniel,

Daniel Hartwig writes:
> * Terminology
>
> The terminology used in the latest URI spec. (RFC 3986) is not widely used
> elsewhere. Not by Guile, not by the HTTP spec., or other sources.
> Specifically, it defines these terms:
>
> - URI: scheme rest ... [fragment]
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-Ref: rest ... [fragment]
> - URI-Reference: Absolute-URI | Relative-Ref
>
> whereas HTTP and other sources use the terms from an earlier URI
> spec. (RFC 2396):
>
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-URI: rest ... [fragment]
> - URI, URI-Reference: Absolute-URI | Relative-URI
>
> With this patch I have opted to use the latter, more common terms.
> This has the advantage of not requiring massive renaming or
> duplicating of most procedures to include, e.g.
> ‘uri-reference-scheme’, as we just use the term ‘uri’ to refer to
> either type.
>
> If this is undesired, it can easily be reworked to use the terminology
> from RFC 3986.

Thanks for your careful work on this, and especially for calling our
attention to the terminology changes introduced in the latest URI spec.

My preference would be to use the newer RFC 3986 terms. To my mind, the
key question is: which type (Absolute-URI or URI-Reference) is more
commonly appropriate in user code, and thus more deserving of the short
term "URI".

I would argue that Absolute-URIs are more often appropriate in typical
user code. The reason is that outside of URI-handling libraries, most
code that deals with URIs simply uses them as universal pointers,
i.e. it implicitly assumes that each URI is sufficient by itself to
identify any resource in the universe.

Working with URI-References is inherently trickier and more error-prone,
because code that handles them must do some additional bookkeeping to
associate each URI-Reference with its _context_. It is inconvenient to
mix URI-References from different contexts, and they must be converted
when moved from one context to another.

For typical code, the simplest and safest strategy for dealing with
URI-References is to convert them to Absolute-URIs as early as possible,
preferably as the document is being read. (Of course, there are special
cases such as editors where it is important to preserve the
URI-References, but that is not the typical case.)

Therefore, I think that Absolute-URI is more deserving of the short term
"URI", and furthermore that existing code outside of (web uri) that
refers to URIs should, by default, be assumed to be talking about
Absolute-URIs. Only after some thought about whether a procedure
handles relative references properly should its type be changed to
accept URI-References.

> * API compatibility
>
> Presently, all APIs work only with absolute URIs. You can not use
> string->uri or build-uri to produce any relative URIs, neither are
> other procedures (generally) expected to work correctly if given them.

For the reasons given above, I think that it is a virtue, not a flaw,
i.e. I think that the latest URI spec (RFC 3986) got this right. It is
important to clearly distinguish Absolute-URIs from URI-References.
Despite their overlapping syntax, they are very different concepts, and
must not be conflated.

Here's what I suggest: instead of extending 'string->uri' and
'build-uri' to produce relative URIs, rename those extended procedures
'string->uri-reference' and 'build-uri-reference'. These are long
names, but that's okay because users should think twice before using
them, and that's seldom needed.

Then, we extend 'string->uri' and 'build-uri' in a different way: we
extend them to handle relative references in their *inputs*, but
continue to provide absolute *outputs*, by adding an optional keyword
argument '#:base-uri'. This would make it easy to implement the
simplest and safest strategy outlined above with a minimum of code
changes.

What do you think?

    Mark
always O_BINARY?
Hi,

Just thinking aloud here -- Windows has this O_BINARY thing that
translates CRLF to LF when reading, and LF to CRLF when writing. It
seems to me to be a useless thing. We already have our own i/o
abstractions and should deal with CRLF vs LF in Scheme, I think:

  The (newline) function can write CRLF
  The ~% format directive should DTRT
  read-line should DTRT

And since all of our hackers have been on POSIX systems, we're used to
there being no O_BINARY/O_TEXT distinction.

So, what do you think about always adding O_BINARY to files that Guile
opens?

Regards,

Andy
-- 
http://wingolog.org/
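A minimal C sketch of what "always adding O_BINARY" could look like, under stated assumptions: `open_binary` is a hypothetical helper name, not a libguile function, and O_BINARY is defined as 0 on platforms that do not provide it. On Windows C runtimes it is the default text mode that performs the CRLF/LF translation; passing O_BINARY switches that translation off.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* POSIX systems have no text/binary distinction, so O_BINARY is usually
   undefined there; treating it as 0 turns the flag into a no-op.  */
#ifndef O_BINARY
# define O_BINARY 0
#endif

/* Hypothetical helper (not part of libguile): open PATH with the
   caller's flags plus O_BINARY, so that the C runtime performs no
   CRLF/LF translation.  Returns a file descriptor, or -1 on error.  */
int
open_binary (const char *path, int flags, mode_t mode)
{
  return open (path, flags | O_BINARY, mode);
}
```

With such a wrapper, any newline convention would have to be handled above the file descriptor, which is the point the message makes about (newline), ~% and read-line.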
Re: always O_BINARY?
Andy Wingo writes:

> Hi,
>
> Just thinking aloud here -- Windows has this O_BINARY thing that
> translates CRLF to LF when reading, and LF to CRLF when writing. It
> seems to me to be a useless thing. We already have our own i/o
> abstractions and should deal with CRLF vs LF in Scheme, I think:
>
> The (newline) function can write CRLF
> The ~% format directive should DTRT
> read-line should DTRT
>
> And since all of our hackers have been on POSIX systems, we're used to
> there being no O_BINARY/O_TEXT distinction.
>
> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Maybe look at what Emacs on Windows does? I would guess it has the same
question, and probably the same answer as you've suggested.

    Neil
Re: [PATCH] Tweak web modules, support relative URIs
On 24 February 2013 18:45, Mark H Weaver wrote:
> Hi Daniel,
>
> Daniel Hartwig writes:
>> * Terminology
>>
>> The terminology used in the latest URI spec. (RFC 3986) is not widely used
>> elsewhere. Not by Guile, not by the HTTP spec., or other sources.
>> Specifically, it defines these terms:
>>
>> - URI: scheme rest ... [fragment]
>> - Absolute-URI: scheme rest ... [fragment]
>> - Relative-Ref: rest ... [fragment]
>> - URI-Reference: Absolute-URI | Relative-Ref
>>
>> whereas HTTP and other sources use the terms from an earlier URI
>> spec. (RFC 2396):
>>
>> - Absolute-URI: scheme rest ... [fragment]
>> - Relative-URI: rest ... [fragment]
>> - URI, URI-Reference: Absolute-URI | Relative-URI
>>
> My preference would be to use the newer RFC 3986 terms. To my mind, the
> key question is: which type (Absolute-URI or URI-Reference) is more
> commonly appropriate in user code, and thus more deserving of the short
> term "URI".
>
> I would argue that Absolute-URIs are more often appropriate in typical
> user code. The reason is that outside of URI-handling libraries, most
> code that deals with URIs simply uses them as universal pointers,
> i.e. it implicitly assumes that each URI is sufficient by itself to
> identify any resource in the universe.

Right. RFC 3986 makes a convincing argument for the new terminology.
These notes about usage also reflect the sentiment in that document.

FWIW, I sat mostly on the fence, finally going away from URI-Reference
due to these concerns I expressed in an earlier email:

> The API seems less clean, and it is not immediately clear
> that uri? is not the top of the URI-like type hierarchy. The other
> functions only indicate “uri” in their name. I did not
> wish to introduce parallel “build-uri-reference”, etc. for each of
> these, and did consider adding #:reference? on some to select
> weaker validation.

and looking at some other Scheme URI modules.

However, having read over your comments I think that we could
reasonably get away with just introducing the procedures you mention
below and not bother about renaming (or duplicating) the field getters
to ‘uri-reference-path’ etc.

> Here's what I suggest: instead of extending 'string->uri' and
> 'build-uri' to produce relative URIs, rename those extended procedures
> 'string->uri-reference' and 'build-uri-reference'. These are long
> names, but that's okay because users should think twice before using
> them, and that's seldom needed.

In your proposed solution, ‘uri?’ and ‘uri-reference?’ are the
predicates and they respond according to the RFC rather than internal
Guile details? That is:

  (uri? (string->uri-reference "http://example.net/"))
  => #t
  (uri-reference? (string->uri-reference "http://example.net/"))
  => #t
  (uri? (string->uri-reference "foo"))
  => #f

or …?

> Then, we extend 'string->uri' and 'build-uri' in a different way: we
> extend them to handle relative references in their *inputs*, but
> continue to provide absolute *outputs*, by adding an optional keyword
> argument '#:base-uri'. This would make it easy to implement the
> simplest and safest strategy outlined above with a minimum of code
> changes.

This strategy does reflect the recommendation of RFC 3986 to resolve
the references as they are read. It is also an elegant API, as it
encourages immediately resolving uri-references and never creating (or
considering creating) the context-sensitive relative-refs.

>
> What do you think?
>

I quite like it, particularly the last part about #:base-uri.

Ludo, I think this is basically what you were suggesting in the first place?
:-)
Re: always O_BINARY?
> From: Andy Wingo
> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Lilypond, Gnucash, Denemo, Autogen and Emacs all run on Windows to
varying degrees. As does Gnome Games. If it doesn't break any of
them, then it might be okay.

In an ideal world, there would be a cross-platform build bot that runs
'make check' on each of these things so that one could know if a change
was going to break something.

But, for what it is worth, I think it is a bad idea.

If you imagine a program that uses autoconf... One way to deal with the
rapid churn of API in Guile is to check for the presence or absence of a
function. Most of our API changes could be detected in a configure
script by checking to see if a procedure is present or absent.

This would be something else entirely. To deal with this in an autoconf
sense, one would have to write a test that actually reads and writes a
file.

-Mike
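To make "a test that actually reads and writes a file" concrete, here is a hedged C sketch of such a probe. It is an illustration only: `bytes_stored_for_newline` is a hypothetical name, and the probe inspects the C runtime's default open() mode rather than Guile's port layer, which a real configure check for Guile would have to exercise instead. It assumes a POSIX-like environment (or MinGW) that provides <unistd.h>.

```c
#include <fcntl.h>
#include <unistd.h>

#ifndef O_BINARY
# define O_BINARY 0   /* no text/binary distinction on this platform */
#endif

/* Hypothetical probe: write a single "\n" to PATH using the default
   open() mode, then read the file back untranslated and report how
   many bytes were stored.  1 means no translation took place; 2 means
   LF was expanded to CRLF.  Returns -1 on any I/O error.  The scratch
   file is removed before returning.  */
long
bytes_stored_for_newline (const char *path)
{
  char buf[8];
  long n;
  int fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

  if (fd < 0)
    return -1;
  if (write (fd, "\n", 1) != 1)
    {
      close (fd);
      unlink (path);
      return -1;
    }
  if (close (fd) != 0)
    return -1;

  /* Read back in binary mode so the stored bytes are seen as they are.  */
  fd = open (path, O_RDONLY | O_BINARY);
  if (fd < 0)
    return -1;
  n = (long) read (fd, buf, sizeof buf);
  close (fd);
  unlink (path);
  return n;
}
```

A configure script could compile and run a small program around this function and branch on the result, which is considerably heavier than the presence-or-absence checks the message describes.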
Re: [PATCH] Tweak web modules, support relative URIs
Daniel Hartwig writes:

> On 24 February 2013 18:45, Mark H Weaver wrote:
>> I would argue that Absolute-URIs are more often appropriate in typical
>> user code. The reason is that outside of URI-handling libraries, most
>> code that deals with URIs simply uses them as universal pointers,
>> i.e. it implicitly assumes that each URI is sufficient by itself to
>> identify any resource in the universe.
>
> Right. RFC 3986 makes a convincing argument for the new terminology.
> These notes about usage also reflect the sentiment in that document.
>
> FWIW, I sat mostly on the fence, finally going away from URI-Reference
> due to these concerns I expressed in an earlier email:
>> The API seems less clean, and it is not immediately clear
>> that uri? is not the top of the URI-like type hierarchy. The other
>> functions only indicate “uri” in their name. I did not
>> wish to introduce parallel “build-uri-reference”, etc. for each of
>> these, and did consider adding #:reference? on some to select
>> weaker validation.
>
> and looking at some other Scheme URI modules.
>
> However, having read over your comments I think that we could
> reasonably get away with just introducing the procedures you mention
> below and not bother about renaming (or duplicating) the field getters
> to ‘uri-reference-path’ etc.

Hmm. The cleanest solution would probably be to duplicate the field
getters, and make the 'uri-*' variants (e.g. 'uri-path') raise an error
when applied to a relative reference.

However, it's probably not that important, so if you think it's better
to simply extend 'uri-path' etc to apply to all URI-References, I'm okay
with that.

>> Here's what I suggest: instead of extending 'string->uri' and
>> 'build-uri' to produce relative URIs, rename those extended procedures
>> 'string->uri-reference' and 'build-uri-reference'. These are long
>> names, but that's okay because users should think twice before using
>> them, and that's seldom needed.
>
> In your proposed solution, ‘uri?’ and ‘uri-reference?’ are the
> predicates and they respond according to the RFC rather than internal
> Guile details?

What do you mean by "rather than internal Guile details"?

Here's how I like to think about these types: URI-Reference is at the
top of the type hierarchy, and URI (a.k.a. Absolute-URI) and
Relative-Ref are subtypes. Furthermore, every URI-Reference is either
an Absolute-URI or a Relative-Ref.

In other words, if you create a URI-Reference that happens to be
absolute, then you'll end up with a URI, in the same sense that if you
create a complex number whose imaginary part happens to be exact zero,
you'll end up with a real number.

> That is:
>
> (uri? (string->uri-reference "http://example.net/"))
> => #t
> (uri-reference? (string->uri-reference "http://example.net/"))
> => #t
> (uri? (string->uri-reference "foo"))
> => #f

Yes.

>> Then, we extend 'string->uri' and 'build-uri' in a different way: we
>> extend them to handle relative references in their *inputs*, but
>> continue to provide absolute *outputs*, by adding an optional keyword
>> argument '#:base-uri'. This would make it easy to implement the
>> simplest and safest strategy outlined above with a minimum of code
>> changes.
>
> This strategy does reflect the recommendation of RFC 3986 to resolve
> the references as they are read. It is also an elegant API, as it
> encourages immediately resolving uri-references and never creating (or
> considering creating) the context-sensitive relative-refs.
>
>>
>> What do you think?
>>
>
> I quite like it, particularly the last part about #:base-uri.
>
> Ludo, I think this is basically what you were suggesting in the first place?
> :-)

Excellent! BTW, to be clear, I suggest that 'string->uri' and
'build-uri' should be guaranteed to produce Absolute-URIs. In other
words, they should raise an error if not given enough information to
produce an Absolute-URI.

Does that make sense?

Thanks again for your work on this :)

    Mark
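The split between the two subtypes can be illustrated with the scheme grammar of RFC 3986 (a letter, then letters, digits, "+", "-" or ".", terminated by ":"). The sketch below is written in C, independently of Guile's (web uri) module, and `has_uri_scheme` is a hypothetical name. It only tests for a leading scheme, which is what separates a URI from a Relative-Ref; it does not validate the rest of the string, so it is weaker than a real 'uri?' predicate would be.

```c
#include <stdbool.h>
#include <stddef.h>

static bool
ascii_alpha (char c)
{
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static bool
ascii_digit (char c)
{
  return c >= '0' && c <= '9';
}

/* Hypothetical helper: return true if S starts with an RFC 3986
   scheme followed by ':', i.e. S can only be a URI and not a
   Relative-Ref.  Nothing after the ':' is checked.  */
bool
has_uri_scheme (const char *s)
{
  size_t i;

  if (!ascii_alpha (s[0]))
    return false;                 /* a scheme must start with a letter */

  for (i = 1; s[i] != '\0'; i++)
    {
      char c = s[i];
      if (c == ':')
        return true;              /* found the end of a valid scheme */
      if (!(ascii_alpha (c) || ascii_digit (c)
            || c == '+' || c == '-' || c == '.'))
        return false;             /* e.g. '/' or '?' before any ':' */
    }
  return false;                   /* no ':' at all: relative reference */
}
```

Under this sketch "http://example.net/" has a scheme while "foo" and "a/b:c" do not, which matches the expected answers in the quoted examples.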
Re: Programming racket like in guile
stefan.ita...@gmail.com writes:

> 1. misc small utilities used in the translation process. This can
> probably be compartmentalized more but it's kind of nice to have one
> include file.
>
> 2. Syntax parse. I used syntax parse to make most of the more advanced
> macros in the compatibility layer.
>
> 3. Racket For loops, used quite extensively in racket code
>
> 4. Racket Structs, also used quite a lot in racket code
>
> 5. Racket lambda utilities, used extensively in contract code
>
> 6. racket contracts,
>
> 7. racket match, A nice matcher that even has PEG qualities.

OK, I was hoping that things would be somewhat independent, but
apparently no.

What would have been nice IMO is to import, say, ‘syntax-parse’ and
contracts, without having to pull in a whole compatibility layer.

Ludo’.
scm_t_subr warnings
Hey Guilers,

Andy and Ludo and I were discussing this on IRC and it was suggested
that we move the question to the mailing list.

I'm trying to compile some code -- using `gcc -pedantic' -- that invokes
`scm_c_make_gsubr', and I'm getting the following warning:

  warning: ISO C forbids passing argument 5 of ‘scm_c_make_gsubr’
  between function pointer and ‘void *’ [-pedantic]
  /usr/local/include/guile/2.0/libguile/gsubr.h:63:13: note: expected
  ‘scm_t_subr’ but argument is of type ‘struct scm_unused_struct *
  (*)(struct scm_unused_struct *, struct scm_unused_struct *)’

I was confused, because I was sure that Guile defines scm_t_subr as
`SCM (*) ()', meaning that an `scm_t_subr' is of unspecified arity. And
I was right, but only at libguile compilation time. From __scm.h:

  #ifdef BUILDING_LIBGUILE
  typedef SCM (* scm_t_subr) ();
  #else
  typedef void *scm_t_subr;
  #endif

Thus the warning: ISO C lets you convert any object pointer to
`void *' implicitly, but not a function pointer.

Ludovic suggested that this bit of preprocessor magic exists to support
C++, in which the `()' style of function prototyping is equivalent to
`(void)'. But that leaves people like me who want to be, well,
pedantic, in a tough spot. Is there anything we can do about this?

One thing I was thinking was that we could support the C++ case (and
others) more explicitly. E.g.:

  #ifdef __cplusplus
  typedef void *scm_t_subr;
  #else
  typedef SCM (* scm_t_subr) ();
  #endif

What do you think?

Regards,
Julian
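One alternative that stays within ISO C is to store callbacks as a generic function pointer rather than `void *', and to cast back to the real signature at the call site; converting between function pointer types and back is permitted by the standard. The sketch below illustrates that idea only. It is not what libguile does, and every name in it (`value`, `generic_subr`, `subr_entry`, `make_subr`, `call_subr_2`, `second_arg`) is a hypothetical stand-in.

```c
#include <stddef.h>

struct obj { int tag; };
typedef struct obj *value;              /* stand-in for Guile's SCM */
typedef void (*generic_subr) (void);    /* stand-in for scm_t_subr */

/* A registered primitive: name, required argument count, and the
   function stored behind a generic function pointer type.  */
struct subr_entry
{
  const char *name;
  int required_args;
  generic_subr fn;
};

struct subr_entry
make_subr (const char *name, int required_args, generic_subr fn)
{
  struct subr_entry e;
  e.name = name;
  e.required_args = required_args;
  e.fn = fn;
  return e;
}

/* Call a two-argument primitive.  The pointer is converted back to
   its true type before the call, so the call itself is well defined
   as long as the entry really holds a two-argument function.  */
value
call_subr_2 (const struct subr_entry *e, value a, value b)
{
  return ((value (*) (value, value)) e->fn) (a, b);
}

/* Example primitive: return the second argument.  */
value
second_arg (value a, value b)
{
  (void) a;
  return b;
}
```

The caller writes an explicit cast when registering, for example `make_subr ("second", 2, (generic_subr) second_arg)`. That is more typing than the unprototyped `()' form, but the same declaration is valid in both C and C++.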
Re: always O_BINARY?
Hi!

Andy Wingo writes:

> Just thinking aloud here -- Windows has this O_BINARY thing that
> translates CRLF to LF when reading, and LF to CRLF when writing. It
> seems to me to be a useless thing. We already have our own i/o
> abstractions and should deal with CRLF vs LF in Scheme, I think:

Yes.

> The (newline) function can write CRLF
> The ~% format directive should DTRT
> read-line should DTRT

IMO the correct abstraction here is transcoders à la R6RS.

The problem is that scm_t_port doesn’t have any slot to specify the EOL
style, but it would need one.

> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Yes, but only when there’s a per-port EOL style, since otherwise we’d
just remove functionality, no?

Ludo’.
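A per-port EOL style could work roughly as sketched below: the port carries an end-of-line setting, and output is translated in user space before it reaches a file descriptor opened in binary mode. This is a hedged illustration, not Guile code. The enum and `encode_eol` are hypothetical names, and only a subset of the R6RS eol-style values (none, lf, cr, crlf) is covered.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical per-port setting, loosely modelled on R6RS eol-style.  */
enum eol_style { EOL_NONE, EOL_LF, EOL_CR, EOL_CRLF };

/* Copy the NUL-terminated string IN to OUT, replacing each '\n' by
   the line ending selected by STYLE.  Returns the number of bytes
   written (OUT is not NUL-terminated), or (size_t) -1 if OUT_SIZE is
   too small.  EOL_NONE and EOL_LF pass '\n' through unchanged.  */
size_t
encode_eol (enum eol_style style, const char *in,
            char *out, size_t out_size)
{
  size_t n = 0;

  for (; *in != '\0'; in++)
    {
      const char *seq = in;       /* default: copy the byte as is */
      size_t len = 1;

      if (*in == '\n' && style == EOL_CRLF)
        {
          seq = "\r\n";
          len = 2;
        }
      else if (*in == '\n' && style == EOL_CR)
        seq = "\r";

      if (n + len > out_size)
        return (size_t) -1;       /* output buffer would overflow */
      memcpy (out + n, seq, len);
      n += len;
    }
  return n;
}
```

With translation done at this level, the underlying descriptor can always be opened with O_BINARY without losing the ability to write CRLF files.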
Re: Programming racket like in guile
On Sunday, February 24, 2013 10:07:36 PM Ludovic Courtès wrote:
> What would have been nice IMO is to import, say, ‘syntax-parse’ and
> contracts, without having to pull in a whole compatibility layer.
>
> Ludo’.

I would say that I tried harder to make syntax-parse independent.
Contracts, on the other hand, depend on syntax-parse, for loops, Racket
structs, etc.

That said, syntax-parse is quite a hefty library.

/Stefan
Re: CPS Update
On 23 February 2013 18:49, Mark H Weaver wrote:
> William ML Leslie writes:
>> Recompiling every procedure that uses + when somebody binds it means
>> compiling a lot of code that probably isn't going to be used. More
>> likely, if + has been inlined here, the compiler will have to emit a
>> guard that checks inlining assumptions at the start of the let body.
>
> I'm afraid this isn't good enough. Even if one ignores the possibility
> of multiple threads, checks would have to be added not just at the start
> of each let body, but also upon return from every procedure that might
> rebind '+' or capture its continuation. This includes all procedures
> accessed through toplevel/module bindings.

Not each let body, the let body in the example code. Specifically, a
guard needs to be placed whenever code with undetermined effect
happens-before a 'call' to an inlined function. That we are talking
about happens-before means the possibility of runtime invalidation of
code is limited not only by calls to functions of unknown effect, but
also by usages of the variable.

> Therefore, I repeat my initial assertion that this is a can of worms.

Except that most dynamic compilers for imperative languages already do
this (because it's a pretty common thing to need to do).

-- 
William Leslie
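The guard being discussed can be pictured with a small C sketch. It assumes a single thread and models the toplevel binding of '+' as a function pointer; all names are hypothetical and nothing here reflects Guile's actual compiler. The guard sits after a call of unknown effect and before the inlined addition, which is the happens-before placement described above. It does not address the multi-threading and continuation-capture concerns raised in the quoted text.

```c
#include <stddef.h>

typedef long (*binop) (long, long);

/* The primitive that the compiler assumed '+' is bound to.  */
long builtin_add (long a, long b) { return a + b; }

/* Some other procedure a program might rebind '+' to.  */
long replacement_sub (long a, long b) { return a - b; }

/* Model of the mutable toplevel binding of '+'.  */
static binop plus_binding = builtin_add;

void rebind_plus (binop f) { plus_binding = f; }

/* A procedure of unknown effect that happens to rebind '+'.  */
void rebind_to_sub (void) { rebind_plus (replacement_sub); }

/* Compiled form of: call UNKNOWN, then evaluate (+ a b).  Because
   UNKNOWN may have rebound '+', the inlined addition is protected by
   a guard that re-checks the inlining assumption.  */
long
sum_after_unknown_call (void (*unknown) (void), long a, long b)
{
  if (unknown != NULL)
    unknown ();                   /* code with undetermined effect */

  if (plus_binding == builtin_add)
    return a + b;                 /* guard holds: use the inlined fast path */

  return plus_binding (a, b);     /* guard failed: call through the binding */
}
```

When no call of unknown effect separates two uses of '+', a compiler following this scheme could reuse the result of one guard for both, which is why the number of guards depends on where such calls occur and not on the number of let bodies.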