Re: [Denemo-devel] Problem with wide characters on upgrading to guile 2.x

2013-09-04 Thread Richard Shann
On Tue, 2013-09-03 at 19:53 +0200, Thien-Thi Nguyen wrote:
[...]
>pt_BR.utf8 is not a supported locale on your system.

yes, that is why I chose it, so as to trigger the error. It is graceful
handling of this error that I was after - it should default to
untranslated, not abort the program.

> 
> Could the spelling be part of the problem? 

I strongly suspect this stuff is case-insensitive.

The problem surely lies in one of the calls in the line

 scm_setlocale (scm_variable_ref (scm_c_lookup ("LC_ALL")),
                scm_from_locale_string (""));

I'll dig down into that when I understand it better.

Thank you for the response,

Richard Shann





Re: [Denemo-devel] Problem with wide characters on upgrading to guile 2.x

2013-09-04 Thread Mark H Weaver
Richard Shann  writes:

> On Tue, 2013-09-03 at 19:53 +0200, Thien-Thi Nguyen wrote:
> [...]
>>pt_BR.utf8 is not a supported locale on your system.
>
> yes, that is why I chose it, so as to trigger the error. It is graceful
> handling of this error that I was after - it should default to
> untranslated, not abort the program.

I disagree.  If Guile does not know what encoding to use, that's a
serious error that should not be ignored, at least not by default.
Please keep in mind that Guile is used for many diverse tasks, including
scripts that are run when the user's not looking, e.g. cron jobs.  Guile
must not silently corrupt data, which could easily happen if it silently
ignores the error without knowing what character encoding to use.

For fully interactive programs, ignoring the error and defaulting to the
C locale (e.g. ASCII) is more reasonable, because any corruptions will
hopefully be noticed by the user.  To do that, please consider the code
that Ludovic suggested:

  (catch 'system-error
    (lambda ()
      (setlocale LC_ALL ""))
    (lambda args
      (format (current-error-port)
              "warning: failed to install locale: ~a~%"
              (strerror (system-error-errno args)))))

However, it should be noted that if you do this, Guile 2 (unlike 1.8)
will still raise an error later if you try to read a byte outside of the
ASCII range, or if you try to write a non-ASCII character.
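
To make that failure mode concrete: under the C locale only bytes below
0x80 are safe, so a script buffer can be checked before it is handed to
Guile. The following is a minimal C sketch of such a check.
first_non_ascii is a hypothetical helper name, not part of Guile or
Denemo.

```c
#include <stddef.h>

/* Return the index of the first byte outside the ASCII range (>= 0x80),
   or -1 if the whole buffer is plain ASCII.
   Hypothetical helper for illustration; not a Guile API. */
static long first_non_ascii(const char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if ((unsigned char) buf[i] >= 0x80)
            return (long) i;
    }
    return -1;
}
```

A caller could warn the user, or refuse to evaluate the script, when the
result is not -1 and no usable locale was installed.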

Note that Guile 1.8 conflated characters and bytes, like many older
non-i18n programs, and thus effectively chose Latin-1 by default.
However, since most modern GNU systems use UTF-8 by default (at least
outside of the CJK world) this guess is most likely wrong, and thus
likely to silently corrupt non-ASCII characters.
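
The mismatch can be seen by counting characters both ways. This is a
minimal C sketch, assuming valid UTF-8 input; both function names are
made up for illustration and are not Guile APIs.

```c
#include <stddef.h>
#include <string.h>

/* Characters seen when every byte is treated as one character,
   which is the Latin-1 style reading. */
static size_t latin1_char_count(const char *s)
{
    return strlen(s);
}

/* Characters seen under UTF-8: count every byte that is not a
   continuation byte (bit pattern 10xxxxxx). Assumes valid UTF-8. */
static size_t utf8_char_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the five bytes "caf\xc3\xa9" (the word "café" in UTF-8), the
byte-per-character reading reports 5 characters and the UTF-8 reading
reports 4, so a program guessing Latin-1 would see two characters where
the author wrote one.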

 Regards,
   Mark



Re: [Denemo-devel] Problem with wide characters on upgrading to guile 2.x

2013-09-04 Thread Richard Shann
On Wed, 2013-09-04 at 12:17 -0400, Mark H Weaver wrote:
> Richard Shann  writes:
> 
> > On Tue, 2013-09-03 at 19:53 +0200, Thien-Thi Nguyen wrote:
> > [...]
> >>pt_BR.utf8 is not a supported locale on your system.
> >
> > yes, that is why I chose it, so as to trigger the error. It is graceful
> > handling of this error that I was after - it should default to
> > untranslated, not abort the program.
> 
> I disagree.  If Guile does not know what encoding to use, that's a
> serious error that should not be ignored, at least not by default.
> Please keep in mind that Guile is used for many diverse tasks, including
> scripts that are run when the user's not looking, e.g. cron jobs.  Guile
> must not silently corrupt data, which could easily happen if it silently
> ignores the error without knowing what character encoding to use.

Ah, I see.

> 
> For fully interactive programs, ignoring the error and defaulting to the
> C locale (e.g. ASCII) is more reasonable, because any corruptions will
> hopefully be noticed by the user.  To do that, please consider the code
> that Ludovic suggested:
> 
>   (catch 'system-error
>     (lambda ()
>       (setlocale LC_ALL ""))
>     (lambda args
>       (format (current-error-port)
>               "warning: failed to install locale: ~a~%"
>               (strerror (system-error-errno args)))))
> 
> However, it should be noted that if you do this, Guile 2 (unlike 1.8)
> will still raise an error later if you try to read a byte outside of the
> ASCII range, or if you try to write a non-ASCII character.

Hmm, this all started with guile barfing on utf8 characters in the
scripts it gets from the Denemo program. I'm afraid I've messed up the
title of the thread by a misused "Group Reply", but the setlocale is
being called from C as a fix for that problem (barfing on utf8):

scm_setlocale (scm_variable_ref (scm_c_lookup ("LC_ALL")),
               scm_from_locale_string (""));
 
as suggested by Mike Gran. This works when I use the program with a
(utf8) locale that I have installed, that is scripts with embedded utf8
characters work again. But guile aborts with the erroneous locale set.
Perhaps this is not so important after all.
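
One possible way to get a graceful fallback on the C side is to probe
the locale with plain libc before involving Guile. The sketch below uses
only the standard setlocale from <locale.h>. install_locale_or_fallback
is a hypothetical helper, and whether Denemo should probe first and only
then call scm_setlocale is an assumption, not something confirmed in
this thread.

```c
#include <locale.h>
#include <stdio.h>

/* Try to install the locale called `name` ("" means "take it from the
   environment"). On failure, warn and fall back to the "C" locale.
   Returns 1 if `name` was installed, 0 if the fallback was used.
   Hypothetical helper; uses libc only, not libguile. */
static int install_locale_or_fallback(const char *name)
{
    if (setlocale(LC_ALL, name) != NULL)
        return 1;

    fprintf(stderr,
            "warning: failed to install locale \"%s\"; using \"C\"\n",
            name);
    setlocale(LC_ALL, "C");
    return 0;
}
```

Calling install_locale_or_fallback("") at startup would report an
unsupported locale as a warning instead of an abort. As noted earlier in
the thread, Guile 2 would then still raise an error on non-ASCII input.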

Thanks

Richard





schemishes sed

2013-09-04 Thread Stefan Israelsson Tampe
Hi all, as I told you in an earlier email, I've been poking with a
grep and sed tool that knows about scheme. Now to see where I'm
heading just consider the following streamed output.

(define (f)
  (format #t
"

(let ((x (+ 1 a))
  (y 2)
  (z 3) 
  (w 4))
  (do-someting x y z w))


"))

The task is to write a program that changes the let to tel and swaps (y
2) to (2 y) while keeping the whitespace reasonably sane. We can
take on this task by first defining match classes,

(define-match-class swap
  (pattern #(l (x y) r) 
 #:with tr #'#(l (x.l y.it x.r y.l x.it y.r) r)))

(define-match-class (tr-it oldval newval)
  (pattern #(l ,oldval r)
 #:with tr #`#(l #,(datum->syntax #'a newval) r)))

This is similar to syntax-parse's define-syntax-class, but we allow
ice-9 match semantics with ~and, ~or, ~not, _ ... 'x x ` as usual. There
is one extra form, (~var x (class a ...)) or (~var x class), which
lets x match a syntax class class, just as in syntax-parse. The ,
match symbol (unquote) is matched against a variable from the outside
context of the pattern. A variable x matches a token including its
whitespace (whitespace is matched greedily and can include comments;
#; is treated as a token in itself, and we work on it just as in
normal scheme). One can then use x.l, x.r and x.it, with x.l being the
whitespace to the left, x.r the whitespace to the right and x.it the
actual token. In the incoming stream each token is bound to a vector
#(l it r), and one can use that directly to match whitespace when e.g.
the it is a constant and not a variable. It is also possible to use
(~and x 3.14) in the matcher. When we assemble the result that should be
inserted into the stream, one does not need to use vectors again, but
vectors will work, as can be seen in the tr-it class.

So swap will swap x and y preserving whitespace. tr-it will translate
an oldval to newval.
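
The whitespace bookkeeping of swap can be sketched outside of scheme.
This is a minimal C illustration, not the tool's implementation: struct
token and swap_pair are hypothetical names, and the way whitespace is
assigned to the two tokens in the example is an assumption.

```c
#include <stdio.h>

/* One token from the stream, mirroring the #(l it r) vector. */
struct token {
    const char *l;   /* whitespace to the left of the token  */
    const char *it;  /* the token text itself                */
    const char *r;   /* whitespace to the right of the token */
};

/* Write the swapped pair into `out` in the order used by the swap
   class: x.l y.it x.r y.l x.it y.r. The whitespace stays in place and
   only the token texts trade positions. Returns the snprintf result. */
static int swap_pair(const struct token *x, const struct token *y,
                     char *out, size_t size)
{
    return snprintf(out, size, "%s%s%s%s%s%s",
                    x->l, y->it, x->r, y->l, x->it, y->r);
}
```

With x = {"", "y", " "} and y = {"", "2", ""}, the contents of (y 2),
swap_pair writes "2 y", which is the inside of (2 y).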

Now to actually do the transform we can issue,



(define (test)
   (par-sed (scm-sed (#(l ((~var let (tr-it 'let 'tel))
 #(a ((~var bind swap) ...) b)
 body ...) r)
 #'#(l (let.tr #(a (bind.tr ...) b) body ...) r)))
  (f)))

And get

scheme@(guile-user)> (test)



(tel (((+ 1 a) x)
  (2 y)
  (3 z) 
  (4 w))
  (do-someting x y z w))


Nice! Note that what remains is to bind a Self procedure to
be able to do recursive translations of y and body ... . That's on the
current todo. Also note how we made the evaluation composable e.g.

  scm-sed produces a matcher that, if it matches, prints the result,
  else fails
  par-sed takes a matcher, a std-output generating function and
  perhaps a few flags and then 

This allows one to reuse scm-sed as an argument to a grepper when we
only want to see the matched results e.g.

(par-grep (s-seq (scm-sed (pat c ...) ...) print-nl) (f))

This will actually output the old and the new matched string.

So the tools are quite an interesting combination of syntax-parse and
ice-9 match. It is quite fast because it only translates and
creates objects when there is a match, so it works by actually using
a matcher of the form,

  (s-and silent-match
         (s-seq capture-sexp do-the-translation))

As you see, the silent match does almost no consing apart from closure
creations and should be lightweight. Also the silent matcher uses
a backtracker tuned to not explode on you, so it should be quite
ok. It does enough cuts to not blow the stack or memory, and any
prolog variables are reclaimed properly, e.g. it should be able to
handle large files if no bugs remain in this respect. The capturing
sexp uses syntax-parse, which can be seen in the outputted code for
the matcher e.g. 

(lambda (a b cc)
  (let ((m (f-or! (s-parens
(f-seq (tr-it-match 'let 'tel)
   (s-parens (f-seq (f* swap-match)))
   (f* (sexp))
(l (
 (c)
 (.. (c) ((sexp! a b cc) c))
 (
   (sed-print
 ((lambda (x)
(syntax-parse
  x
  (#(l
 ((~var let (tr-it-class 'let 'tel))
  #(a ((~var bind swap-class) ...) b)
  (~var body Sexp)
  ...)
 r)
   (syntax
 #(l (let.tr #(a (bind.tr ...) b) body ...) 
r)
  c)))
 ( 'ok
(f-and m l)))

Hence it is possible to add extra checks to restrict the match further
than the silent matcher. Also on the list is to add possibilities to
stop the sed process and actually interact with the current silent
match, e.g. one might want to change the output matcher and see if it
matches, one might want to edit the outputted code for whitespace, or
simply check to see why it fails by getting a traced output. anything
is possible a

Re: smob gc protection, and inheritance

2013-09-04 Thread Ludovic Courtès
Hi Doug,

Doug Evans  skribis:

> I have a few questions about smobs:

I’m assuming Guile 2.x here.

> 1) Suppose I have some C code that creates a smob and its containing
> SCM, but does not always expose the SCM to Scheme.
>
> E.g.
>
> struct foo_object
> {
>   int bar;
>   SCM baz;
> };
>
> static SCM
> make_foo_smob (void)
> {
>   struct foo_object *foo_smob = (struct foo_object *)
> scm_gc_malloc (sizeof (struct foo_object), "foo");
>   SCM foo_scm;
>
>   foo_smob->bar = -1;
>   foo_smob->baz = SCM_BOOL_F;
>
>   foo_scm = scm_new_smob (foo_smob_tag, (scm_t_bits) foo_smob);
>
>   return foo_scm;  
> }
>
> If the caller stores foo_smob in the heap somewhere, and not foo_scm,
> is that enough to prevent the object from being garbage collected?

Yes, because the region returned by ‘scm_gc_malloc’ is scanned by the GC.

> 2) Is it possible to inherit, e.g., with goops, a smob?
> IOW, can I extend a smob through inheritance?
> Or must I store the smob in a class, and provide accessors?
> [kinda like the "is a" vs "has a" relationship]

Presumably, at least to some extent:

--8<---cut here---start->8---
scheme@(guile-user)> (use-modules (gnutls))
scheme@(guile-user)> (make-session connection-end/client)
$1 = #
scheme@(guile-user)> (use-modules (oop goops))
scheme@(guile-user)> (class-of $1)
$2 = #<  2e26000>
scheme@(guile-user)> (define-class  ())
scheme@(guile-user)> (change-class $1 )
ERROR: In procedure scm-error:
ERROR: No applicable method for #< change-class (1)> in call 
(change-class # #<  2f33000>)

Entering a new prompt.  Type `,bt' for a backtrace or `,q' to continue.
--8<---cut here---end--->8---

Andy can say more, I think.  :-)

> 3) The docs aren't as clear as they could be on whether the "smob"
> free function needs to scm_gc_free all results of calls to scm_gc_malloc
> made when constructing the smob.  IIUC, this is not necessary.

The ‘scm_gc_free’ function doesn’t need to be called nowadays, because
the GC automatically frees ‘scm_gc_malloc’ regions when they are no
longer referenced.

So chances are you don’t even need a SMOB ‘free’ function.

> However, why does the image example do this?

Indeed, the ‘mark’ and ‘free’ functions in that example could be removed
altogether, since the only resource associated with the SMOB is memory
returned by ‘scm_gc_malloc’.

Thanks,
Ludo’.