Re: regexp-split for Guile

2012-10-21 Thread Daniel Hartwig
On 20 October 2012 22:16, Mark H Weaver wrote:
> Honestly, this question makes me wonder if the proposed 'regexp-split'
> is too complicated.  If you want to trim whitespace, how about using
> 'string-trim-right' or 'string-trim-both' before splitting?  It seems
> more likely to do what I would expect.

Yes.  Keep it simple.  Operations like trim-whitespace and
drop-empty-strings-from-the-result (mentioned in the previous
discussion) are so easy to do outside of regexp-split, why complicate
the semantics?

Limit is arguably more fundamental to the procedure.  Anything else is
pre- or post-processing.
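
To make the pre- and post-processing concrete, here is a rough sketch. It
uses Guile's existing 'string-split' as a stand-in, since 'regexp-split'
is still only a proposal; 'split-words' is a hypothetical name used for
illustration and is not part of any patch.

```scheme
(use-modules (srfi srfi-1))   ; provides `remove`

;; Hypothetical helper: trim first, split, then drop empty fields.
;; `string-split' stands in for the proposed `regexp-split'.
(define (split-words str)
  (remove string-null?
          (string-split (string-trim-both str) #\space)))

(split-words "  foo  bar ")   ; => ("foo" "bar")
```

The trim happens before the split and the empty-string filter after it,
so the splitter itself needs no extra options.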



Re: regexp-split for Guile

2012-10-21 Thread Chris K. Jester-Young
On Sat, Oct 20, 2012 at 10:16:49AM -0400, Mark H Weaver wrote:
> Sorry, that last example is wrong of course, but both of these examples
> raise an interesting question about how #:limit and #:trim should
> interact.  To my mind, the top example above is correct.  I think the
> last result should be "baz", not "baz  ".
[...]
> Honestly, this question makes me wonder if the proposed 'regexp-split'
> is too complicated.  If you want to trim whitespace, how about using
> 'string-trim-right' or 'string-trim-both' before splitting?  It seems
> more likely to do what I would expect.

Thanks so much for your feedback, Mark! I appreciate it.

Yeah, I think given the left-to-right nature of regex matching, the
only kind of trimming that makes sense is a right trim. And then once
you do that, people start asking for left trim, and mayhem begins. ;-)

I do want to consider the string pre-trimming approach, as it's more
clear what's going on, and is less "magical" (where "magic" is a plus
in the Perl world, and not so much of a plus in other languages).

Thankfully, the string-trim{,-right,-both} functions you mentioned use
substring behind the scenes, which uses copy-on-write. So that solves
one of my potential concerns, which is that a pre-trim would require
copying most of the string.

*   *   *

Granted, if you want trimming-with-complicated-regex-delimiter, and
not just whitespace, then your best bet is to trim the output list.
This is slightly more complicated, because my original code simply
uses drop-while before reversing the output list for return, but since
the caller doesn't receive the reversed list, they either have to
reverse+trim+reverse (yuck), or we have to implement drop-right-while
(like you mentioned previously).
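
For comparison, the reverse+trim+reverse route can be sketched with
SRFI-1's 'drop-while'. The name 'drop-right-while/reverse' is made up
for this example; it only shows what a caller would have to write
without a built-in.

```scheme
(use-modules (srfi srfi-1))   ; provides `drop-while`

;; Hypothetical name: trims trailing elements that satisfy PRED by
;; reversing, dropping from the front, and reversing back.
(define (drop-right-while/reverse pred lst)
  (reverse (drop-while pred (reverse lst))))

(drop-right-while/reverse string-null? '("foo" "" "bar" "" ""))
;; => ("foo" "" "bar")
```

It works, but it walks and conses the list twice, which is the "yuck"
part.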

In that regard, here's one implementation of drop-right-while (that I
just wrote on the spot):

(define (drop-right-while pred lst)
  (let recur ((lst lst))
    (if (null? lst) '()
        (let ((elem (car lst))
              (next (recur (cdr lst))))
          ;; Drop elem only if everything after it was dropped too.
          (if (and (null? next) (pred elem)) '()
              (cons elem next))))))

One could theoretically write drop-right-while! also (I can think of
two different implementation strategies) but it sounds like it's more
work than it's worth.

So, that's our last hurdle: we "just" have to get drop-right-while
integrated into Guile, then we can separate out the splitting and
trimming processes. And everybody will be happy. :-)

Comments welcome,
Chris.



Re: regexp-split for Guile

2012-10-21 Thread Chris K. Jester-Young
On Sun, Oct 21, 2012 at 04:20:09PM +0800, Daniel Hartwig wrote:
> Yes.  Keep it simple.  Operations like trim-whitespace and
> drop-empty-strings-from-the-result (mentioned in the previous
> discussion) are so easy to do outside of regexp-split, why complicate
> the semantics?

"So easy", but so verbose. We should prefer to make common use cases
easy (and succinct) to use, and not optimise for uncommon ones.

Anyway, in my response to Mark, I mentioned that if we can get
drop-right-while in-tree, we have a middle ground that should make
"everyone" happy. I am against requiring users of regexp-split to
reinvent that wheel each time. Leave the reinvention to Phil Bewig.[1]

Cheers,
Chris.

[1] http://lists.nongnu.org/archive/html/chicken-users/2009-05/msg00024.html
"I never use SRFI-1. My personal standard library has take, drop,
 take-while, drop-while, range, iterate, filter, zip, [...]"



Re: Needed: per-port reader options

2012-10-21 Thread Ludovic Courtès
Hi,

Mark H Weaver skribis:

> Section 2.1 of the R7RS (draft 6) explicitly says "The #!fold-case
> directive causes the read procedure to case-fold [...] each identifier
> and character name subsequently read from the same port."

OK, this is more precise than SRFI-105, and definitely per-port (the
semantics are not quite to my taste, but well...)

Then the weak hash table mapping ports to reader options seems like the
“right” approach to implement these semantics.
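
A minimal sketch of that idea, assuming a weak-key table keyed on the
port so that entries disappear once the port is collected. The names
'%port-reader-options', 'port-reader-option' and
'set-port-reader-option!' are invented here for illustration and are
not Guile API.

```scheme
;; Weak-key table: an entry goes away when its port is garbage-collected.
(define %port-reader-options (make-weak-key-hash-table))

;; Look up option KEY for PORT, returning DEFAULT when it is unset.
(define (port-reader-option port key default)
  (let ((alist (hashq-ref %port-reader-options port '())))
    (cond ((assq key alist) => cdr)
          (else default))))

;; Record KEY -> VALUE for PORT; newer entries shadow older ones.
(define (set-port-reader-option! port key value)
  (let ((alist (hashq-ref %port-reader-options port '())))
    (hashq-set! %port-reader-options port
                (acons key value alist))))
```

A reader seeing #!fold-case on a port would then call something like
(set-port-reader-option! port 'fold-case #t) and consult the table for
each later identifier read from that same port.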

Ludo’.



Fixing the slib mess

2012-10-21 Thread Mikael Djurfeldt
Dear Guile hackers,

What nice work you are doing!

For those who don't know me, I'm a Guile developer who has been doing
other stuff for some time.

When trying to use Guile 2 for logic programming I discovered that the
slib interface is again broken (and has been for quite some time).
This easily happens because it is a very fragile interface.  The way
this is supposed to be used (and as documented in the manual), one
does a

  (use-modules (ice-9 slib))

and can then do

  (require 'modular)

etc.

The module (ice-9 slib) forms a kind of sandbox so that all new
definitions that are imported through "require" are loaded as local
bindings in the (ice-9 slib) module and are exported through the
public interface of (ice-9 slib).

The implementation of the interface has two sides.  One, the file
ice-9/slib.scm, is owned by Guile.  The other, slib/guile.init, is
owned by slib.  slib has such .init files for some common Scheme
implementations but I early on noticed that the guile.init file
is not really maintained.  I decided that it would be more robust if
slib.scm incorporated most of the interface so that it would be easy
to update it as Guile changed.  But of course slib also changed and at
some point others felt that guile.init should contain most of the
interface and the bulk of slib.scm was moved there.  As we have seen,
this didn't make things much better.

I'll let you ponder on how to handle the fundamental problems with
this interface, but, as a Guile user, I think it would be nice if the
interface works as written in the manual.  Attached to this email
you'll find two patches.  The patch to slib.scm copies a snippet of
code from guile.init so that they agree with each other and with the
Guile reference manual on how to find slib in the filesystem.  This
patch for example makes SCHEME_LIBRARY_PATH work as described.
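
To give a rough idea of the lookup involved (this is a sketch under my
own assumptions, not the code in the attached patch): prefer
SCHEME_LIBRARY_PATH when it is set, otherwise search %load-path for
slib/guile.init. The name 'find-slib-directory' is hypothetical.

```scheme
;; Hypothetical helper, not taken from slib.scm or guile.init.
;; Returns the directory believed to hold slib, or #f if none is found.
(define (find-slib-directory)
  (or (getenv "SCHEME_LIBRARY_PATH")
      ;; Fall back to wherever slib/guile.init lives on the load path.
      (let ((hit (search-path %load-path "slib/guile.init")))
        (and hit (dirname hit)))))
```

With SCHEME_LIBRARY_PATH exported in the environment, the first branch
wins and the load path is never consulted.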

I've tried to write the patch to guile.init so that it can play well
with older Guile versions, but we should test this.  In order to make
it work with Guile 2, though, I had to introduce a new syntax binding
syntax-toplevel?.  Given a syntax object (as available within a
syntax-case transformer), it decides if the object originates from top
level context.  It is used, as in the old memoizing macro transformer,
to choose whether to define-public or just define.

*But*, the proper implementation of syntax-toplevel? requires
modification of psyntax.scm and adding it to the (system syntax)
module.  I didn't want to do this until I've had your comments, so the
present patch has its own syntax-object accessors (which breaks
abstraction and is therefore not a real solution).  I should also say
that I have not yet fixed the slib interface to the new Guile uniform
arrays, so there's a lot of slib functionality which won't yet work.

Comments?  Can I add syntax-toplevel? to psyntax.scm and (system
syntax)?  Do you think it is reasonable to submit something along the
lines of guile.init.diff to slib guile.init?

Best regards,
Mikael Djurfeldt


slib.scm.diff
Description: Binary data


guile.init.diff
Description: Binary data