Re: [PATCH] Tweak web modules, support relative URIs

2013-02-24 Thread Mark H Weaver
Hi Daniel,

Daniel Hartwig  writes:
> * Terminology
>
> The terminology used in latest URI spec. (RFC 3986) is not widely used
> elsewhere.  Not by Guile, not by the HTTP spec., or other sources.
> Specifically, it defines these terms:
>
> - URI: scheme rest ... [fragment]
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-Ref: rest ... [fragment]
> - URI-Reference: Absolute-URI | Relative-Ref
>
> whereas HTTP and other sources use the terms from an earlier URI
> spec. (RFC 2396):
>
> - Absolute-URI: scheme rest ... [fragment]
> - Relative-URI: rest ... [fragment]
> - URI, URI-Reference: Absolute-URI | Relative-URI
>
> With this patch I have opted to use the latter, more common terms.
> This has the advantage of not requiring massive renaming or
> duplicating of most procedures to include, e.g.
> ‘uri-reference-scheme’, as we just use the term ‘uri’ to refer to
> either type.
>
> If this is undesired, it can easily be reworked to use the terminology
> from RFC 3986.

Thanks for your careful work on this, and especially for calling our
attention to the terminology changes introduced in the latest URI spec.

My preference would be to use the newer RFC 3986 terms.  To my mind, the
key question is: which type (Absolute-URI or URI-Reference) is more
commonly appropriate in user code, and thus more deserving of the short
term "URI".

I would argue that Absolute-URIs are more often appropriate in typical
user code.  The reason is that outside of URI-handling libraries, most
code that deals with URIs simply uses them as universal pointers,
i.e. it implicitly assumes that each URI is sufficient by itself to
identify any resource in the universe.

Working with URI-References is inherently trickier and more error-prone,
because code that handles them must do some additional bookkeeping to
associate each URI-Reference with its _context_.  It is inconvenient to
mix URI-References from different contexts, and they must be converted
when moved from one context to another.

For typical code, the simplest and safest strategy for dealing with
URI-References is to convert them to Absolute-URIs as early as possible,
preferably as the document is being read.  (Of course, there are special
cases such as editors where it is important to preserve the
URI-References, but that is not the typical case).

Therefore, I think that Absolute-URI is more deserving of the short term
"URI", and furthermore that existing code outside of (web uri) that
refers to URIs should, by default, be assumed to be talking about
Absolute-URIs.  Only after some thought about whether a procedure
handles relative references properly should its type be changed to
accept URI-References.

> * API compatibility
>
> Presently, all APIs work only with absolute URIs.  You can not use
> string->uri or build-uri to produce any relative URIs, neither are
> other procedures (generally) expected to work correctly if given them.

For the reasons given above, I think that this restriction is a virtue,
not a flaw, i.e. I think that the latest URI spec (RFC 3986) got this
right.

It is important to clearly distinguish Absolute-URIs from
URI-References.  Despite their overlapping syntax, they are very
different concepts, and must not be conflated.

Here's what I suggest: instead of extending 'string->uri' and
'build-uri' to produce relative URIs, rename those extended procedures
'string->uri-reference' and 'build-uri-reference'.  These are long
names, but that's okay because users should think twice before using
them, and that's seldom needed.

Then, we extend 'string->uri' and 'build-uri' in a different way: we
extend them to handle relative references in their *inputs*, but
continue to provide absolute *outputs*, by adding an optional keyword
argument '#:base-uri'.  This would make it easy to implement the
simplest and safest strategy outlined above with a minimum of code
changes.

What do you think?

Mark



always O_BINARY?

2013-02-24 Thread Andy Wingo
Hi,

Just thinking aloud here -- Windows has this O_BINARY/O_TEXT thing: in
text mode it translates CRLF to LF when reading, and LF to CRLF when
writing, unless the file is opened with O_BINARY.  It seems to me to be
a useless thing.  We already have our own i/o abstractions and should
deal with CRLF vs LF in Scheme, I think:

  The (newline) function can write CRLF
  The ~% format directive should DTRT
  read-line should DTRT

And since all of our hackers have been on POSIX systems, we're used to
there being no O_BINARY/O_TEXT distinction.

So, what do you think about always adding O_BINARY to files that Guile
opens?
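
A minimal C sketch of the idea, assuming the common portability idiom of defining O_BINARY to 0 where the platform lacks it.  The helper names `binary_open_flags` and `chomp_eol` are hypothetical, not Guile API; the point is only that EOL handling can live above the file descriptor once every file is opened in binary mode.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>

/* POSIX systems have no text/binary distinction, so O_BINARY is usually
   missing there; defining it to 0 turns the flag into a no-op. */
#ifndef O_BINARY
# define O_BINARY 0
#endif

/* Hypothetical helper: add O_BINARY to whatever flags the caller
   intends to pass to open(). */
static int binary_open_flags(int flags)
{
  return flags | O_BINARY;
}

/* Hypothetical helper: strip one trailing LF or CRLF from a
   NUL-terminated line, the way a line reader might after reading raw
   bytes.  A bare trailing CR is left alone.  Returns the new length. */
static size_t chomp_eol(char *line)
{
  size_t len;

  assert(line != NULL);
  len = strlen(line);
  if (len > 0 && line[len - 1] == '\n') {
    len--;
    if (len > 0 && line[len - 1] == '\r')
      len--;
  }
  line[len] = '\0';
  return len;
}
```

With something like this in the line reader, a file written on Windows and one written on POSIX yield the same line contents regardless of how the C library would have translated them.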

Regards,

Andy
-- 
http://wingolog.org/



Re: always O_BINARY?

2013-02-24 Thread Neil Jerram
Andy Wingo  writes:

> Hi,
>
> Just thinking aloud here -- Windows has this O_BINARY/O_TEXT thing: in
> text mode it translates CRLF to LF when reading, and LF to CRLF when
> writing, unless the file is opened with O_BINARY.  It seems to me to be
> a useless thing.  We already have our own i/o abstractions and should
> deal with CRLF vs LF in Scheme, I think:
>
>   The (newline) function can write CRLF
>   The ~% format directive should DTRT
>   read-line should DTRT
>
> And since all of our hackers have been on POSIX systems, we're used to
> there being no O_BINARY/O_TEXT distinction.
>
> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Maybe look at what Emacs on Windows does?  I would guess it has the same
question, and probably the same answer as you've suggested.

   Neil



Re: [PATCH] Tweak web modules, support relative URIs

2013-02-24 Thread Daniel Hartwig
On 24 February 2013 18:45, Mark H Weaver  wrote:
> Hi Daniel,
>
> Daniel Hartwig  writes:
>> * Terminology
>>
>> The terminology used in latest URI spec. (RFC 3986) is not widely used
>> elsewhere.  Not by Guile, not by the HTTP spec., or other sources.
>> Specifically, it defines these terms:
>>
>> - URI: scheme rest ... [fragment]
>> - Absolute-URI: scheme rest ... [fragment]
>> - Relative-Ref: rest ... [fragment]
>> - URI-Reference: Absolute-URI | Relative-Ref
>>
>> whereas HTTP and other sources use the terms from an earlier URI
>> spec. (RFC 2396):
>>
>> - Absolute-URI: scheme rest ... [fragment]
>> - Relative-URI: rest ... [fragment]
>> - URI, URI-Reference: Absolute-URI | Relative-URI
>>

> My preference would be to use the newer RFC 3986 terms.  To my mind, the
> key question is: which type (Absolute-URI or URI-Reference) is more
> commonly appropriate in user code, and thus more deserving of the short
> term "URI".
>
> I would argue that Absolute-URIs are more often appropriate in typical
> user code.  The reason is that outside of URI-handling libraries, most
> code that deals with URIs simply uses them as universal pointers,
> i.e. it implicitly assumes that each URI is sufficient by itself to
> identify any resource in the universe.

Right.  RFC 3986 makes a convincing argument for the new terminology.
These notes about usage also reflect the sentiment in that document.

FWIW, I sat mostly on the fence, finally going away from URI-Reference
due to these concerns I expressed in an earlier email:
> The API seems less clean, and it is not immediately clear
> that uri? is not the top of the URI-like type hierarchy.  The other
> functions only indicate “uri” in their name.  I did not
> wish to introduce parallel “build-uri-reference”, etc. for each of
> these, and did consider adding #:reference? on some to select
> weaker validation.

and looking at some other Scheme URI modules.

However, having read over your comments I think that we could
reasonably get away with just introducing the procedures you mention
below and not bother about renaming (or duplicating) the field getters
to ‘uri-reference-path’ etc.

> Here's what I suggest: instead of extending 'string->uri' and
> 'build-uri' to produce relative URIs, rename those extended procedures
> 'string->uri-reference' and 'build-uri-reference'.  These are long
> names, but that's okay because users should think twice before using
> them, and that's seldom needed.

In your proposed solution, ‘uri?’ and ‘uri-reference?’ are the
predicates and they respond according to the RFC rather than internal
Guile details?  That is:

  (uri? (string->uri-reference "http://example.net/"))
  => #t
  (uri-reference? (string->uri-reference "http://example.net/"))
  => #t
  (uri? (string->uri-reference "foo"))
  => #f

or …?

> Then, we extend 'string->uri' and 'build-uri' in a different way: we
> extend them to handle relative references in their *inputs*, but
> continue to provide absolute *outputs*, by adding an optional keyword
> argument '#:base-uri'.  This would make it easy to implement the
> simplest and safest strategy outlined above with a minimum of code
> changes.

This strategy does reflect the recommendation of RFC 3986 to resolve
the references as they are read.  It is also an elegant API, as it
encourages immediately resolving uri-references and never creating (or
even considering creating) the context-sensitive relative-refs.

>
> What do you think?
>

I quite like it, particularly the last part about #:base-uri.

Ludo, I think this is basically what you were suggesting in the first place? :-)




Re: always O_BINARY?

2013-02-24 Thread Mike Gran
> From: Andy Wingo 
> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Lilypond, Gnucash, Denemo, Autogen and Emacs all run on Windows
to varying degrees.  As does Gnome Games.  If it doesn't break
any of them, then it might be okay.  In an ideal world, there would
be a cross-platform build bot that runs 'make check' on each of
these things so that one could know if a change was going to 
break something.
 
But, for what it is worth, I think it is a bad idea.
 
If you imagine a program that uses autoconf... One way to deal with
the rapid churn of API in Guile is to check for the presence or
absence of a function.  Most of our API changes could be detected
in a configure script by checking to see if a procedure is present
or absent.
 
This would be something else entirely.  To deal with this in an
autoconf sense, one would have to write a test that actually reads
and writes a file.
 
-Mike



Re: [PATCH] Tweak web modules, support relative URIs

2013-02-24 Thread Mark H Weaver
Daniel Hartwig  writes:

> On 24 February 2013 18:45, Mark H Weaver  wrote:
>> I would argue that Absolute-URIs are more often appropriate in typical
>> user code.  The reason is that outside of URI-handling libraries, most
>> code that deals with URIs simply uses them as universal pointers,
>> i.e. it implicitly assumes that each URI is sufficient by itself to
>> identify any resource in the universe.
>
> Right.  RFC 3986 makes a convincing argument for the new terminology.
> These notes about usage also reflect the sentiment in that document.
>
> FWIW, I sat mostly on the fence, finally going away from URI-Reference
> due to these concerns I expressed in an earlier email:
>> The API seems less clean, and it is not immediately clear
>> that uri? is not the top of the URI-like type hierarchy.  The other
>> functions only indicate “uri” in their name.  I did not
>> wish to introduce parallel “build-uri-reference”, etc. for each of
>> these, and did consider adding #:reference? on some to select
>> weaker validation.
>
> and looking at some other Scheme URI modules.
>
> However, having read over your comments I think that we could
> reasonably get away with just introducing the procedures you mention
> below and not bother about renaming (or duplicating) the field getters
> to ‘uri-reference-path’ etc.

Hmm.  The cleanest solution would probably be to duplicate the field
getters, and make the 'uri-*' variants (e.g. 'uri-path') raise an error
when applied to a relative reference.  However, it's probably not that
important, so if you think it's better to simply extend 'uri-path' etc
to apply to all URI-References, I'm okay with that.

>> Here's what I suggest: instead of extending 'string->uri' and
>> 'build-uri' to produce relative URIs, rename those extended procedures
>> 'string->uri-reference' and 'build-uri-reference'.  These are long
>> names, but that's okay because users should think twice before using
>> them, and that's seldom needed.
>
> In your proposed solution, ‘uri?’ and ‘uri-reference?’ are the
> predicates and they respond according to the RFC rather than internal
> Guile details?

What do you mean by "rather than internal Guile details"?

Here's how I like to think about these types: URI-Reference is at the
top of the type hierarchy, and URI (a.k.a. Absolute-URI) and
Relative-Ref are subtypes.  Furthermore, every URI-Reference is either
an Absolute-URI or a Relative-Ref.

In other words, if you create a URI-Reference that happens to be
absolute, then you'll end up with a URI, in the same sense that if you
create a complex number whose imaginary part happens to be exact zero,
you'll end up with a real number.

> That is:
>
>   (uri? (string->uri-reference "http://example.net/"))
>   => #t
>   (uri-reference? (string->uri-reference "http://example.net/"))
>   => #t
>   (uri? (string->uri-reference "foo"))
>   => #f

Yes.
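
The distinction those predicates draw comes down to whether the string starts with a scheme.  A rough C sketch, assuming only the RFC 3986 `scheme` production (a letter, then letters, digits, "+", "-" or ".", terminated by ":"); the name `has_scheme` is made up for illustration and this is not the (web uri) implementation, nor a full validity check of the rest of the string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if S begins with an RFC 3986 scheme followed by ':',
   i.e. it would be classified as a URI rather than a Relative-Ref;
   return 0 otherwise.  Only the prefix is inspected. */
static int has_scheme(const char *s)
{
  assert(s != NULL);
  if (!isalpha((unsigned char) s[0]))
    return 0;
  for (s++; *s != '\0'; s++) {
    if (*s == ':')
      return 1;
    if (!isalnum((unsigned char) *s)
        && *s != '+' && *s != '-' && *s != '.')
      return 0;
  }
  return 0;   /* ran out of input without seeing ':' */
}
```

Under this sketch every string is a candidate URI-Reference, and the ones for which `has_scheme` holds are the subset that `uri?` would accept, matching the three cases quoted above.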

>> Then, we extend 'string->uri' and 'build-uri' in a different way: we
>> extend them to handle relative references in their *inputs*, but
>> continue to provide absolute *outputs*, by adding an optional keyword
>> argument '#:base-uri'.  This would make it easy to implement the
>> simplest and safest strategy outlined above with a minimum of code
>> changes.
>
> This strategy does reflect the recommendation of RFC 3986 to resolve
> the references as they are read.  It is also an elegant API, as it
> encourages immediately resolving uri-references and never creating (or
> even considering creating) the context-sensitive relative-refs.
>
>>
>> What do you think?
>>
>
> I quite like it, particularly the last part about #:base-uri.
>
> Ludo, I think this is basically what you were suggesting in the first
> place? :-)

Excellent!  BTW, to be clear, I suggest that 'string->uri' and
'build-uri' should be guaranteed to produce Absolute-URIs.  In other
words, they should raise an error if not given enough information to
produce an Absolute-URI.  Does that make sense?
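
To make the contract concrete, here is a C sketch of "absolute out, or an error".  The function names are hypothetical, and the merge step is deliberately simplified: it only replaces the last path segment of a base of the form scheme://..., and does not implement the full RFC 3986 section 5.2 algorithm (no dot-segment removal, no handling of references that start with "/", "//", "?" or "#").

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* 1 if S starts with an RFC 3986 scheme followed by ':'. */
static int has_scheme(const char *s)
{
  assert(s != NULL);
  if (!isalpha((unsigned char) s[0]))
    return 0;
  for (s++; *s != '\0'; s++) {
    if (*s == ':')
      return 1;
    if (!isalnum((unsigned char) *s)
        && *s != '+' && *s != '-' && *s != '.')
      return 0;
  }
  return 0;
}

/* Hypothetical analogue of string->uri with an optional base.
   Writes an absolute URI to OUT and returns 0, or returns -1 when
   there is not enough information to produce one (relative REF and
   no usable BASE) or when OUT is too small. */
static int to_absolute(const char *ref, const char *base,
                       char *out, size_t outsz)
{
  const char *authority, *slash;
  int n;

  if (has_scheme(ref)) {
    n = snprintf(out, outsz, "%s", ref);       /* already absolute */
  } else {
    if (base == NULL || !has_scheme(base))
      return -1;                               /* the error case */
    authority = strstr(base, "://");
    if (authority == NULL)
      return -1;                               /* outside this sketch */
    slash = strrchr(authority + 3, '/');
    if (slash == NULL)                         /* base has no path */
      n = snprintf(out, outsz, "%s/%s", base, ref);
    else                                       /* drop last segment */
      n = snprintf(out, outsz, "%.*s%s",
                   (int) (slash - base + 1), base, ref);
  }
  return (n >= 0 && (size_t) n < outsz) ? 0 : -1;
}
```

The design point is that a caller can never end up holding a relative reference by accident: either the result is absolute or the call fails.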

Thanks again for your work on this :)

 Mark



Re: Programming racket like in guile

2013-02-24 Thread Ludovic Courtès
stefan.ita...@gmail.com skribis:

> 1. Miscellaneous small utilities used in the translation process. This
> can probably be compartmentalized more, but it's kind of nice to have
> one include file.
>
> 2. Syntax parse. I used syntax parse to make most of the more advanced
> macros in the compatibility layer.
>
> 3. Racket For loops, used quite extensively in racket code
>
> 4. Racket Structs, also used quite a lot in racket code
>
> 5. Racket lambda utilities, used extensively in contract code
>
> 6. racket contracts,
>
> 7. racket match, A nice matcher that even has PEG qualities.

OK, I was hoping that things would be somewhat independent, but
apparently not.

What would have been nice IMO is to import, say, ‘syntax-parse’ and
contracts, without having to pull in a whole compatibility layer.

Ludo’.




scm_t_subr warnings

2013-02-24 Thread Julian Graham
Hey Guilers,

Andy and Ludo and I were discussing this on IRC and it was suggested
that we move the question to the mailing list. I'm trying to compile
some code -- using `gcc -pedantic' -- that invokes `scm_c_make_gsubr',
and I'm getting the following warning:

  warning: ISO C forbids passing argument 5 of ‘scm_c_make_gsubr’
    between function pointer and ‘void *’ [-pedantic]
  /usr/local/include/guile/2.0/libguile/gsubr.h:63:13: note: expected
    ‘scm_t_subr’ but argument is of type ‘struct scm_unused_struct *
    (*)(struct scm_unused_struct *, struct scm_unused_struct *)’

I was confused, because I was sure that Guile defines scm_t_subr as
`SCM (*) ()', meaning that an `scm_t_subr' is of unspecified arity.
And I was right, but only at libguile compilation time. From __scm.h:

  #ifdef BUILDING_LIBGUILE
  typedef SCM (* scm_t_subr) ();
  #else
  typedef void *scm_t_subr;
  #endif

Thus the warning: ISO C lets you cast any kind of pointer to `void *'
-- except for a function pointer. Ludovic suggested that this bit of
preprocessor magic exists to support C++, in which the `()' style of
function prototyping is equivalent to `(void)'. But that leaves people
like me who want to be, well, pedantic, in a tough spot. Is there
anything we can do about this? One thing I was thinking was that we
could support the C++ case (and others) more explicitly. E.g.:

  #ifdef __cplusplus
  typedef void *scm_t_subr;
  #else
  typedef SCM (* scm_t_subr) ();
  #endif

What do you think?
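
For comparison, here is a sketch of another option that stays within ISO C; it is not what libguile does.  ISO C guarantees that a function pointer converted to another function pointer type and back compares equal to the original, so a generic function pointer type can play the role that `void *` plays for object pointers.  The names `generic_fn`, `binary_fn`, `add_longs` and `call_binary` are invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A "generic" function pointer type, used only for storage and
   passing around; it must be cast back before being called. */
typedef void (*generic_fn)(void);

/* The real type of the function we register in this example. */
typedef long (*binary_fn)(long, long);

static long add_longs(long a, long b)
{
  return a + b;
}

/* Takes the function as a generic pointer, the way a registration
   API might, and converts it back to its true type before the call.
   Calling through the wrong type would be undefined behaviour, so the
   caller must know the arity, just as with scm_c_make_gsubr. */
static long call_binary(generic_fn f, long a, long b)
{
  assert(f != NULL);
  return ((binary_fn) f)(a, b);
}
```

A caller writes `call_binary((generic_fn) add_longs, 2, 3)`; the explicit cast at the call site is the price of avoiding the `void *` conversion that `-pedantic` complains about.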


Regards,
Julian



Re: always O_BINARY?

2013-02-24 Thread Ludovic Courtès
Hi!

Andy Wingo  skribis:

> Just thinking aloud here -- Windows has this O_BINARY/O_TEXT thing: in
> text mode it translates CRLF to LF when reading, and LF to CRLF when
> writing, unless the file is opened with O_BINARY.  It seems to me to be
> a useless thing.  We already have our own i/o abstractions and should
> deal with CRLF vs LF in Scheme, I think:

Yes.

>   The (newline) function can write CRLF
>   The ~% format directive should DTRT
>   read-line should DTRT

IMO the correct abstraction here is transcoders à la R6RS.  The problem
is that scm_t_port doesn’t have any slot to specify the EOL style, but
it would need one.
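
As a rough illustration of what such a slot could look like, here is a C sketch of a port carrying its own EOL style.  Everything in it (`struct sketch_port`, `enum eol_style`, `port_write`, `port_newline`) is hypothetical and says nothing about the real layout of scm_t_port.

```c
#include <assert.h>
#include <string.h>

enum eol_style { EOL_LF, EOL_CRLF };

/* Hypothetical port: an in-memory output buffer plus the per-port
   EOL style that a transcoder would consult. */
struct sketch_port {
  enum eol_style eol;
  char buf[64];
  size_t len;
};

/* Append N raw bytes to the port, untranslated. */
static void port_write(struct sketch_port *p, const char *s, size_t n)
{
  assert(p != NULL && s != NULL);
  assert(n <= sizeof p->buf - p->len);
  memcpy(p->buf + p->len, s, n);
  p->len += n;
}

/* What (newline) would do: emit the line terminator chosen for this
   particular port, independent of the platform's text mode. */
static void port_newline(struct sketch_port *p)
{
  assert(p != NULL);
  if (p->eol == EOL_CRLF)
    port_write(p, "\r\n", 2);
  else
    port_write(p, "\n", 1);
}
```

With the style stored on the port, the underlying file can always be opened in binary mode and the choice of terminator becomes an explicit, per-port decision.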

> So, what do you think about always adding O_BINARY to files that Guile
> opens?

Yes, but only when there’s a per-port EOL style, since otherwise we’d
just remove functionality, no?

Ludo’.




Re: Programming racket like in guile

2013-02-24 Thread Stefan Israelsson Tampe
On Sunday, February 24, 2013 10:07:36 PM Ludovic Courtès wrote:
> What would have been nice IMO is to import, say, ‘syntax-parse’ and
> contracts, without having to pull in a whole compatibility layer.
> 
> Ludo’.

I would say that I put more effort into making syntax-parse
independent.  Contracts, on the other hand, depend on syntax-parse, for
loops, Racket structs, etc.  That said, syntax-parse is quite a hefty
library.

/Stefan




Re: CPS Update

2013-02-24 Thread William ML Leslie
On 23 February 2013 18:49, Mark H Weaver  wrote:
> William ML Leslie  writes:
>> Recompiling every procedure that uses + when somebody binds it means
>> compiling a lot of code that probably isn't going to be used.  More
>> likely, if + has been inlined here, the compiler will have to emit a
>> guard that checks inlining assumptions at the start of the let body.
>
> I'm afraid this isn't good enough.  Even if one ignores the possibility
> of multiple threads, checks would have to be added not just at the start
> of each let body, but also upon return from every procedure that might
> rebind '+' or capture its continuation.  This includes all procedures
> accessed through toplevel/module bindings.

Not each let body, the let body in the example code.  Specifically, a
guard needs to be placed whenever code with undetermined effect
happens-before a 'call' to an inlined function.  That we are talking
about happens-before means the possibility of runtime invalidation of
code is limited not only by calls to functions of unknown effect, but
also by usages of the variable.
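
A toy C model of such a guard, to make the placement concrete.  It assumes a single thread and represents the toplevel binding of '+' as a mutable function pointer; all names are invented, and this is not how Guile's compiler works.

```c
#include <assert.h>
#include <stddef.h>

typedef long (*binop)(long, long);

static long builtin_add(long a, long b) { return a + b; }
static long multiply(long a, long b) { return a * b; }

/* Stand-in for the mutable toplevel binding of '+'. */
static binop plus_binding = builtin_add;

/* Code of "undetermined effect" that happens to rebind '+'. */
static void rebind_plus(void) { plus_binding = multiply; }
static void reset_plus(void) { plus_binding = builtin_add; }

/* An inlined use of '+': the guard checks the inlining assumption and
   falls back to an ordinary call through the binding if it fails. */
static long guarded_plus(long a, long b)
{
  assert(plus_binding != NULL);
  return plus_binding == builtin_add ? a + b : plus_binding(a, b);
}

/* Models (let ((t (+ a b))) (unknown) (+ t c)).  The second use of
   '+' comes after a call of unknown effect, so its guard cannot be
   elided; the check is needed at that use, not at every let body. */
static long sum3(long a, long b, long c, void (*unknown)(void))
{
  long t = guarded_plus(a, b);
  if (unknown != NULL)
    unknown();
  return guarded_plus(t, c);
}
```

If no code of unknown effect runs between two uses, a compiler could in principle reuse the result of the earlier check, which is the sense in which uses of the variable bound the cost.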

> Therefore, I repeat my initial assertion that this is a can of worms.

Except that most dynamic compilers for imperative languages already do
this (because it's a pretty common thing to need to do).

--
William Leslie