Re: build failures of master on netbsd/5 amd64

2009-08-31 Thread Ludovic Courtès
Mike Gran  writes:

> On Fri, 2009-08-21 at 14:27 -0400, Greg Troxel wrote:
>> I ran 'gmake -k' to get all the warnings at once.  I think there are
>> just two groups:
>
>> deprecated.c:1092: warning: dereferencing type-punned pointer will break
>> strict-aliasing rules
>
>> vm-i-scheme.c:437: warning: comparison is always true due to limited range
>> of data type
>
> I fixed the first one.  Someone more familiar with VM will have to fix
> the second. 

I never hit this.  Has it been fixed for those who did?

Thanks,
Ludo'.





Re: towards a more unified procedure application mechanism

2009-08-31 Thread Ludovic Courtès
Hi!

Andy Wingo  writes:

[...]

>>  scm_tcs_closures   Interpreted procedures
>
> I am hoping to devise a way never to see these in the VM, on the
> eval-cleanup branch.

Cool!

>>  scm_tc7_subr_0   SCM (*) () -- 0 arguments.
>>  scm_tc7_subr_1   SCM (*) (SCM) -- 1 argument.
>>  scm_tc7_subr_1o  SCM (*) (SCM) -- 1 optional argument.
>>  scm_tc7_subr_2   SCM (*) (SCM, SCM) -- 2 required args.
>>  scm_tc7_subr_2o  SCM (*) (SCM, SCM) -- 2 optional args.
>>  scm_tc7_subr_3   SCM (*) (SCM, SCM, SCM) -- 3 required args.
>>  scm_tc7_lsubr    SCM (*) (SCM) -- list subrs
>>  scm_tc7_lsubr_2  SCM (*) (SCM, SCM, SCM)
>
> I would like to make these all be gsubrs. There are very few places
> where these constants actually exist in code "out there" -- normally the
> way to do this is to use scm_c_define_gsubr, and it does the right
> thing.
>
> I'll probably do a:
>
> #define scm_tc7_subr_2o \
>   scm_tc7_subr_2o_NO_LONGER_EXISTS_USE_scm_c_define_gsubr
>
> or something like that.

You can't do that because such subrs are created by `create_gsubr ()'
when the signature allows it.  Or did you mean `create_gsubr ()' would
now create only gsubrs?

These specialized subr types allow for faster dispatch, as opposed to
the argument count checks (and argument copies) that are done in
`scm_i_gsubr_apply ()'.  Thus, I think replacing all of them with gsubrs
may have a negative impact performance-wise.
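
To make the dispatch-cost argument concrete, here is a minimal sketch.  It
is not libguile code: `value', `proc', `apply_proc' and the KIND_* tags are
all hypothetical names.  The specialized case calls its function after a
single tag check, while the generic case first range-checks the argument
count and copies the arguments into a padded buffer.

```c
#include <stddef.h>

/* Hypothetical stand-ins; none of these names come from libguile.  */
typedef long value;

enum proc_kind { KIND_SUBR_2, KIND_GSUBR };

typedef struct proc {
  enum proc_kind kind;
  int nreq, nopt;                 /* only consulted for KIND_GSUBR */
  value (*fn2) (value, value);    /* two-argument entry point */
} proc;

#define UNDEFINED_ARG (-1L)       /* placeholder for a missing optional arg */

/* Returns 0 on success, -1 on a wrong number of arguments.  */
static int
apply_proc (const proc *p, const value *argv, size_t argc, value *result)
{
  switch (p->kind)
    {
    case KIND_SUBR_2:
      /* Fast path: the tag alone implies the signature.  */
      if (argc != 2)
        return -1;
      *result = p->fn2 (argv[0], argv[1]);
      return 0;

    case KIND_GSUBR:
      {
        /* Generic path: range-check the count, then copy and pad.  */
        value buf[2];
        size_t i;

        if (argc < (size_t) p->nreq || argc > (size_t) (p->nreq + p->nopt)
            || p->nreq + p->nopt != 2)
          return -1;
        for (i = 0; i < 2; i++)
          buf[i] = i < argc ? argv[i] : UNDEFINED_ARG;
        *result = p->fn2 (buf[0], buf[1]);
        return 0;
      }
    }
  return -1;
}

/* Example two-argument function; a missing second argument counts as 0.  */
static value
add2 (value a, value b)
{
  if (b == UNDEFINED_ARG)
    b = 0;
  return a + b;
}
```

Whether the extra check and copy matter in practice would have to be
measured; the sketch only shows where the additional work comes from.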

>>  scm_tc7_dsubr  double (*) (double) -- double subrs
>
> I'll remove these, changing their implementations to be gsubrs. This
> only affects $sin et al; I'll probably roll the transcendental versions
> into the subrs as well.

Yes, it seems to be seldom used.  However, it might be a semi-public
API...

>>  scm_tc7_cxr    c[da]+r
>
> I'll change these to be subrs.

OK (this was essentially an interpreter hack AIUI).

>>  scm_tc7_asubr  SCM (*) (SCM, SCM) -- "accumulating" subrs.
>
> These are interesting. We have to keep the C signature of e.g. scm_sum,
> otherwise many things would break. So I'd change scm_sum to be a gsubr
> with two optional arguments, and then on the Scheme level do something
> like:
>
>  (define +
>    (let (($+ +))
>      (lambda args
>        (cond ((null? args) ($+))
>              ((null? (cdr args)) ($+ (car args)))
>              ((null? (cddr args)) ($+ (car args) (cadr args)))
>              (else (apply + ($+ (car args) (cadr args)) (cddr args)))))))
>
> The VM already compiles (+ a b c) to (add (add a b) c), where add is a
> primitive binary instruction.

OK.
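
For readers who think in C, the reduction of an n-ary `+' to a binary
primitive amounts to a left fold.  A minimal sketch, assuming plain `long'
values in place of SCM and hypothetical names `add' and `sum_n':

```c
#include <stddef.h>

/* Hypothetical binary primitive, standing in for the VM's `add'.  */
static long
add (long a, long b)
{
  return a + b;
}

/* Left fold: (+ a b c) becomes (add (add a b) c).  With no arguments the
   result is 0, the identity of addition; with one it is the argument.  */
static long
sum_n (const long *args, size_t n)
{
  long acc;
  size_t i;

  if (n == 0)
    return 0;
  acc = args[0];
  for (i = 1; i < n; i++)
    acc = add (acc, args[i]);
  return acc;
}
```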

>>  scm_tc7_rpsubr SCM (*) (SCM, SCM) -- predicate subrs.
>
> Likewise, we'll have to do something like the + case.
>
>>  scm_tc7_smob   Applicable smobs
>
> Well... we probably have to support these also as a primitive procedure
> type. It could be we rework smobs in terms of structs, and if that
> happens, we can use applicable structs -- but barring that, this would
> be a fourth procedure type.

Yes.

>>  scm_tc7_gsubr  Generic subrs
>
> Actually applying gsubrs can be complicated, due to optional args, rest
> args, and the lack of `apply' in C. I guess we should farm out
> application to a scm_i_gsubr_applyv trampoline that takes its args as an
> on-the-stack vector.

Indeed.
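
A rough idea of what such a trampoline could look like, as a sketch only:
`gsubr', `gsubr_applyv' and UNBOUND are hypothetical, rest arguments are
left out, and at most three arguments are supported.  Because C has no
`apply', every arity needs its own call expression.

```c
#include <stddef.h>

typedef long value;                /* hypothetical stand-in for SCM */
#define UNBOUND (-1L)              /* marks an omitted optional argument */

typedef struct gsubr {
  unsigned nreq, nopt;             /* required and optional counts */
  value (*fn) (void);              /* real arity is nreq + nopt */
} gsubr;

/* Apply G to ARGC arguments stored contiguously in ARGV.  Returns 0 on
   success and -1 when the argument count does not fit the signature.  */
static int
gsubr_applyv (const gsubr *g, const value *argv, size_t argc, value *out)
{
  value a[3];
  size_t total = g->nreq + g->nopt, i;

  if (argc < g->nreq || argc > total || total > 3)
    return -1;
  /* Pad omitted optional arguments with UNBOUND.  */
  for (i = 0; i < total; i++)
    a[i] = i < argc ? argv[i] : UNBOUND;

  /* Cast back to the function's real type before calling it.  */
  switch (total)
    {
    case 0: *out = g->fn (); break;
    case 1: *out = ((value (*) (value)) g->fn) (a[0]); break;
    case 2: *out = ((value (*) (value, value)) g->fn) (a[0], a[1]); break;
    case 3:
      *out = ((value (*) (value, value, value)) g->fn) (a[0], a[1], a[2]);
      break;
    }
  return 0;
}

/* Example: one required and one optional argument.  */
static value
sub_opt (value a, value b)
{
  return b == UNBOUND ? -a : a - b;
}
```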

>>  scm_tc7_pws    Procedures with setters
>>  Gets the procedure, and applies that. This needs to be inlined into
>>  the VM to preserve tail recursion.
>
> Ideally these would be implemented as applicable structs. We'll see.
>
>>  scm_tcs_struct Applicable structs
>
> Check if the struct is applicable, and if so apply its effective method.

Sounds like a nice plan!

Thanks,
Ludo'.





Re: Problems with LOAD and latest build

2009-08-31 Thread Ludovic Courtès
Hi,

Andy Wingo  writes:

> Now, does this indicate a bug in Guile, or at least an undesirable
> behavior?

Yes, I think so.

Programs that want to rely on bare R5RS (e.g., SILex) have nothing else
but `load' to have code in separate files.  So I think the compiler
should special-case top-level `load', `primitive-load', etc., calls.

What do you think?

Thanks,
Ludo'.





Minor queries about Unicode char docs

2009-08-31 Thread Neil Jerram
First of all, thanks for making these docs (specifically, commit
3f12aed) so clear.  They seem so much clearer and simpler to me than
the months of back-and-forth discussion on r6rs-discuss.  I know those
things are not really comparable, but I hope you can see what I mean.

Then, a couple of queries.

 SCM_DEFINE1 (scm_char_less_p, "char

Re: build failures of master on netbsd/5 amd64

2009-08-31 Thread Greg Troxel

The end of my build, from a few hours ago:

/bin/ksh ../libtool  --tag=CC   --mode=compile gcc -DHAVE_CONFIG_H   
-DBUILDING_LIBGUILE=1 -I.. -I.. -I../lib -I../lib -I/usr/pkg/include 
-I/usr/y0/include -I/usr/pkg/include -I/usr/y0/include -pthread -Wall 
-Wmissing-prototypes -Werror -fvisibility=hidden -g -O2 -MT libguile_la-vm.lo 
-MD -MP -MF .deps/libguile_la-vm.Tpo -c -o libguile_la-vm.lo `test -f 'vm.c' || 
echo './'`vm.c
 gcc -DHAVE_CONFIG_H -DBUILDING_LIBGUILE=1 -I.. -I.. -I../lib -I../lib 
-I/usr/pkg/include -I/usr/y0/include -I/usr/pkg/include -I/usr/y0/include 
-pthread -Wall -Wmissing-prototypes -Werror -fvisibility=hidden -g -O2 -MT 
libguile_la-vm.lo -MD -MP -MF .deps/libguile_la-vm.Tpo -c vm.c  -fPIC -DPIC -o 
.libs/libguile_la-vm.o
cc1: warnings being treated as errors
In file included from vm-engine.c:139,
 from vm.c:315:
vm-i-scheme.c: In function 'vm_regular_engine':
vm-i-scheme.c:437: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:437: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:439: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:439: warning: comparison is always true due to limited range of 
data type
In file included from vm-engine.c:139,
 from vm.c:323:
vm-i-scheme.c: In function 'vm_debug_engine':
vm-i-scheme.c:437: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:437: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:439: warning: comparison is always true due to limited range of 
data type
vm-i-scheme.c:439: warning: comparison is always true due to limited range of 
data type
gmake[3]: *** [libguile_la-vm.lo] Error 1
gmake[3]: Leaving directory `/home/gdt/BUILD-GUILE-master/guile/libguile'
gmake[2]: *** [all] Error 2
gmake[2]: Leaving directory `/home/gdt/BUILD-GUILE-master/guile/libguile'
gmake[1]: *** [all-recursive] Error 1
gmake[1]: Leaving directory `/home/gdt/BUILD-GUILE-master/guile'
gmake: *** [all] Error 2





SCM_BOOL_F == 0 and BDW-GC

2009-08-31 Thread Ludovic Courtès
Hello!

Neil Jerram  writes:

> Just checking this because Ludovic said recently that (SCM_BOOL_F ==
> 0) would have nice properties for BDW-GC.

Actually he wasn't quite right when he said that.  :-)

The issue with BDW-GC is that "disappearing links" (weak pointers in
libgc parlance) replace pointers to objects that have been reclaimed by
NULL, and there's no way to tell it to use some other value.

That leads to insanities in the weak hash table implementation [0, 1],
which I thought could somehow vanish if SCM_BOOL_F == 0.

Unfortunately that's not true; it would even make things worse because
NULL would now be a valid Scheme value.

Instead what's really needed is a special pointer-to-reclaimed-object
value that can be distinguished from valid Scheme values since that
value ends up in the car or cdr of weak pairs in hash table buckets.  As
such, SCM_PACK (NULL) was a good choice until now.

SCM_UNDEFINED == 0 would work fine because SCM_UNDEFINED is not a valid
Scheme value, but it wouldn't change the implementation.
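
The ambiguity can be shown without libgc.  The sketch below only simulates
a cleared disappearing link; the encoding (MY_FALSE as 0x04) and all names
are assumptions made for illustration, not Guile's actual tags.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical tagged-word encoding: every valid value is nonzero, so
   the all-zero word is free to mean "the referent was reclaimed".  */
typedef uintptr_t word;

#define RECLAIMED   ((word) 0)     /* what a cleared weak link looks like */
#define MY_FALSE    ((word) 0x04)  /* assumed nonzero encoding of #f */

typedef struct weak_pair {
  word car;                        /* weak reference */
  word cdr;
} weak_pair;

/* Simulates the collector clearing a registered link: it can only
   store NULL (zero), never a value of our choosing.  */
static void
simulate_collection (weak_pair *p)
{
  p->car = 0;
}

/* Unambiguous only because no valid value is encoded as zero.  If #f
   were encoded as 0, a live pair holding #f would look reclaimed.  */
static int
weak_car_reclaimed_p (const weak_pair *p)
{
  return p->car == RECLAIMED;
}
```

With MY_FALSE redefined as 0, `weak_car_reclaimed_p' would return true for
a live pair whose car is #f, which is the "even worse" case above.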

Thoughts?

Thanks,
Ludo'.

[0] 
http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/weaks.c?h=boehm-demers-weiser-gc#n40
[1] 
http://git.savannah.gnu.org/cgit/guile.git/tree/libguile/hashtab.c?h=boehm-demers-weiser-gc#n97





Re: truth of %nil

2009-08-31 Thread Ludovic Courtès
Hi,

Ken Raeburn  writes:

> I kind of assumed that making all-bits-zero an invalid value was a
> conscious choice by the Guile (or SCM?) designers which wasn't likely
> to be revisited.  It is, after all, a fairly easy way of highlighting
> a certain class of uninitialized-value problems -- choosing strict
> checking and debugging over letting the programmer be lazy.

Indeed, that could have been one reason.  We could ask Aubrey Jaffer
about this.

> I think I'm mildly in favor of keeping all-bits-zero as an invalid
> representation.  But, if it's a huge win for BDW-GC, maybe it's worth
> it.

As discussed in my other message, it would actually be harmful.

Thanks,
Ludo'.





Re: truth of %nil

2009-08-31 Thread Ken Raeburn

On Aug 31, 2009, at 17:59, Ludovic Courtès wrote:
>> I think I'm mildly in favor of keeping all-bits-zero as an invalid
>> representation.  But, if it's a huge win for BDW-GC, maybe it's worth
>> it.
>
> As discussed in my other message, it would actually be harmful.


Then I'm definitely in favor of keeping it as invalid! :-)

Ken



[BDW-GC] "Inlined" storage; `scm_take_' functions

2009-08-31 Thread Ludovic Courtès
Hello!

Stringbufs and bytevectors are now always "inlined" in the BDW-GC
branch [0, 1], which means that there's no cell->buffer indirection,
which greatly simplifies code (it also takes less room and may slightly
improve performance).

The `scm_take_' functions for strings/symbols/bytevectors are now
essentially aliases to the corresponding `scm_from_' because we cannot
advantageously reuse the provided storage.
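
A sketch of why "take" degenerates into "copy, then free" once storage is
inlined.  The names are hypothetical and plain malloc stands in for the
collector's allocator:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical "inlined" string object: header and characters share one
   allocation, so there is no separate buffer to adopt.  */
typedef struct inline_string {
  size_t len;
  char chars[];                    /* flexible array member (C99) */
} inline_string;

/* Analogue of a `scm_from_' function: copies LEN bytes of STR.  */
static inline_string *
string_from (const char *str, size_t len)
{
  inline_string *s = malloc (sizeof *s + len + 1);

  if (s == NULL)
    return NULL;
  s->len = len;
  memcpy (s->chars, str, len);
  s->chars[len] = '\0';
  return s;
}

/* Analogue of a `scm_take_' function: the caller's malloc'd buffer
   cannot be reused, so copy it and then free it.  */
static inline_string *
string_take (char *str, size_t len)
{
  inline_string *s = string_from (str, len);

  free (str);
  return s;
}
```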

Should these functions be deprecated or discouraged?

Thanks,
Ludo'.

[0] 
http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=ba54a2026beaadb4e7566d4b9e2c9e4c7cd793e6
[1] 
http://git.savannah.gnu.org/cgit/guile.git/commit/?h=boehm-demers-weiser-gc&id=0665b3ffcb7ec5232a51ff632a818a638dfd4054





Re: [BDW-GC] "Inlined" storage; `scm_take_' functions

2009-08-31 Thread Mike Gran
On Tue, 2009-09-01 at 02:14 +0200, Ludovic Courtès wrote:
> Hello!
> 
> Stringbufs and bytevectors are now always "inlined" in the BDW-GC
> branch [0, 1], which means that there's no cell->buffer indirection,
> which greatly simplifies code (it also takes less room and may slightly
> improve performance).

Neat!

> 
> The `scm_take_' functions for strings/symbols/bytevectors are now
> essentially aliases to the corresponding `scm_from_' because we cannot
> advantageously reuse the provided storage.
> 
> Should these functions be deprecated or discouraged?
> 

codesearch.google.com says that scm_take_ isn't often used by other
projects, but it is used by LilyPond.  I think that's reason enough to
leave them in.  I'd vote for keeping them and adjusting the docs to say
something like

 Like `scm_from_locale_string' and `scm_from_locale_stringn',
 respectively, but also immediately frees STR after creating
 the Guile string.

Or something like that.

-Mike







Re: Minor queries about Unicode char docs

2009-08-31 Thread Mike Gran
On Mon, 2009-08-31 at 11:21 +0100, Neil Jerram wrote:
> I think there's a case here for making the docstring not identical to
> the corresponding manual text.  In the manual context, the section
> begins with talking about Unicode, so "Unicode" can be assumed for
> everything that follows.  But in the docstring, when someone types
> (help char 
>   Return `#t' iff the code point of `x' is less than the code
>   point of `y', else `#f'.
> 
> For this context I think it would be clearer to say
> 
>   Return `#t' iff the Unicode code point of `x' is less than the
>   code point of `y', else `#f'.

Sounds good.

> 
> +Case-insensitive character comparisons of characters use @emph{Unicode
> +case folding}.  In case folding comparisons, if a character is
> +lowercase and has an uppercase form that can be expressed as a single
> +character, it is converted to uppercase before comparison.  Unicode
> +case folding is language independent: it uses rules that are generally
> +true, but, it cannot cover all cases for all languages.
> 
> That's very clear, but what if a character doesn't have an uppercase
> form that can be expressed as a single character?  Does Guile then
> throw an exception, or does it perform the comparison with the
> lowercase code point?

I see what you mean.  The text should have something like...

"In case folding comparisons, if a character is lowercase and has an
uppercase form that can be expressed as a single character, its
uppercase form is used in the comparison.  All other characters are not
modified for the comparison.  Note that the German letter Sharp S
(Eszett) is not uppercased before the comparison since its plural has
two characters instead of one."

> 
> Thanks!
> 
>  Neil

Thanks,

Mike





Re: Minor queries about Unicode char docs

2009-08-31 Thread Mike Gran
On Mon, 2009-08-31 at 18:40 -0700, Mike Gran wrote:
> Note that the German letter Sharp S
> (Eszett) is not uppercased before the comparison since its plural has
> two characters instead of one."

I meant to say 'its _uppercase form_ has two characters instead of one'.
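
The rule can be sketched in C as follows.  This is only an illustration:
`simple_upcase' and `char_ci_less_p' are hypothetical names, and the
mapping covers just ASCII and part of Latin-1 rather than the full Unicode
tables.

```c
#include <stdint.h>

/* Hypothetical single-character uppercase mapping covering only ASCII
   and U+00E0..U+00FE; a real implementation would use Unicode tables.  */
static uint32_t
simple_upcase (uint32_t c)
{
  if (c >= 'a' && c <= 'z')
    return c - 0x20;
  /* Latin-1 lowercase letters with a one-character uppercase form.
     U+00F7 (division sign) is not a letter; U+00DF (sharp s) is below
     the range and so is left alone: its uppercase form is "SS".  */
  if (c >= 0xE0 && c <= 0xFE && c != 0xF7)
    return c - 0x20;
  return c;
}

/* Case-insensitive "less than" on code points.  */
static int
char_ci_less_p (uint32_t x, uint32_t y)
{
  return simple_upcase (x) < simple_upcase (y);
}
```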






more compilation failures: -DSCM_DEBUG_TYPING_STRICTNESS=2

2009-08-31 Thread Ken Raeburn

[[ Resending from an account I'm actually subscribed with. ]]

Compiling with SCM_DEBUG_TYPING_STRICTNESS=2 as discussed in __scm.h  
causes SCM to be defined as a union type (though the comments say a  
struct type), which enhances the type checking by making random  
conversions and casts to and from pointer and integer types not work  
without going through the correct conversion macros/functions.


Problem is, we're doing a lot of those.

It also means constant values for static initializers ("{ { BITS } }")  
have a different form from run-time expressions generating certain  
values ("scm_pack (BITS)" calls an inline function), and comparisons  
can't be done with "==" and "!=".  (In fact, tags.h already says "SCM  
values can not be compared by using the operator ==", right above the  
definition of scm_is_eq.)
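
For anyone who hasn't looked at this mode, a stripped-down sketch of the
idea follows.  The names (`strict_scm', `strict_pack', and so on) are
hypothetical stand-ins, not the definitions in tags.h.

```c
#include <stdint.h>

typedef uintptr_t bits_t;          /* hypothetical analogue of scm_t_bits */

/* Strict mode: wrapping the word in a union (the nesting mirrors the
   "{ { BITS } }" initializer form) makes stray casts and `==' fail to
   compile.  */
typedef union strict_scm {
  struct { bits_t n; } n;
} strict_scm;

/* Static initializers need brace syntax...  */
#define STRICT_INIT(x) { { (x) } }

/* ...while run-time code goes through an inline function.  */
static inline strict_scm
strict_pack (bits_t x)
{
  strict_scm s = STRICT_INIT (x);
  return s;
}

static inline bits_t
strict_unpack (strict_scm s)
{
  return s.n.n;
}

/* `a == b' is a compile error on unions, so identity is spelled out.  */
static inline int
strict_is_eq (strict_scm a, strict_scm b)
{
  return strict_unpack (a) == strict_unpack (b);
}

static const strict_scm static_example = STRICT_INIT (0x204);
```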


Guess what we're also doing? :-)
And I haven't even tried compiling Ludovic's bdw-gc-static-alloc  
branch yet, just master.


I can clean some of this up trivially -- SCM_PACK/SCM_UNPACK as  
needed, change == to scm_is_eq.  The initializers make it slightly  
less trivial, and I can imagine different courses of action.


#1: We continue to not support static initialization.  Move most of  
the initializations in the library to the per-file init functions, and  
for stuff like the ra_iproc tables in array-map.c we may want *one*  
internal initializer macro (SCM_I_UNSPECIFIED_INIT or  
SCM_I_UNDEFINED_INIT? maybe even something zero-valued) for filling in  
slots in static structures without getting compiler warnings about  
missing initializers.


#1a: Extend #1 later with whatever internal macros are needed to
provide the right initialization syntax for constructs used in
bdw-gc-static-alloc based on the STRICTNESS setting.


#1b: Try to supplement #1 with changes to SCM_PACK or SCM_MAKIFLAG to  
make it not considered a compile-time constant even with STRICTNESS<2  
and thus SCM_UNSPECIFIED, SCM_BOOL_F, etc are never suitable for  
static initialization, catching this problem earlier in the future.  I  
believe a use of a comma expression will suffice, but finding a form  
that doesn't generate compiler warnings and doesn't generate run-time  
code could be tricky. (Though, it becomes easier if we require only no  
performance impact when optimizing and with ... what, inline function  
support? gcc?)


#1c: Try to supplement #1 by defaulting to STRICTNESS=2 on platforms  
where the union is passed and returned the same way as the pointer or  
integer in function calls, and where there isn't a significant  
performance impact. Probably selected via cpp macros in __scm.h, since  
an autoconf feature test would be difficult at best, and still  
specific to the compiler used for building libguile and not the one  
used to build the application.  This helps us avoid the "==" and  
random casting part of the problem better in the future.  Mac OS X  
(10.5, Intel) seems to use the same calling convention both ways in  
one simple test, though I haven't tried performance testing.


#2: Drop STRICTNESS=2 support and really support static initialization  
with the current macros.


#3: Keep STRICTNESS=2 support, and support static initialization, even  
for application code, with a bunch of new macros.


Thoughts?  My preference is for #1 now, and #1a/b/c when convenient or  
needed.
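
To show what #1 might look like in practice, here is a sketch under stated
assumptions: PLACEHOLDER_INIT plays the role of the suggested internal
initializer macro, and `iproc_entry', `lookup_proc' and `init_table' are
made-up names, not the array-map.c code.

```c
#include <stddef.h>
#include <stdint.h>

typedef union strict_scm { struct { uintptr_t n; } n; } strict_scm;

/* Hypothetical internal placeholder, in the spirit of the suggested
   SCM_I_UNDEFINED_INIT: valid brace syntax, zero-valued.  */
#define PLACEHOLDER_INIT { { 0 } }

typedef struct iproc_entry {
  const char *name;
  strict_scm proc;                 /* filled in at init time */
} iproc_entry;

static iproc_entry table[] = {
  { "add", PLACEHOLDER_INIT },
  { "sub", PLACEHOLDER_INIT },
};

/* Stand-in for a run-time lookup that cannot be a constant expression.  */
static strict_scm
lookup_proc (size_t index)
{
  strict_scm s;
  s.n.n = 0x100 + index;
  return s;
}

/* Per-file init function: replaces every placeholder.  */
static void
init_table (void)
{
  size_t i;
  for (i = 0; i < sizeof table / sizeof table[0]; i++)
    table[i].proc = lookup_proc (i);
}
```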


Ken




Re: more compilation failures: -DSCM_DEBUG_TYPING_STRICTNESS=2

2009-08-31 Thread Ken Raeburn

On Sep 1, 2009, at 02:23, Ken Raeburn wrote:
> I can clean some of this up trivially -- SCM_PACK/SCM_UNPACK as
> needed, change == to scm_is_eq.  The initializers make it slightly
> less trivial, and I can imagine different courses of action.


Okay, not quite so trivial as I blithely asserted.

It looks like the eval code is going to be annoying too -- lots of  
case labels that are constructed by making SCM values and then  
extracting bits from them with ISYMNUM, which won't work with a  
union.  I'm thinking, maybe an enum or list of macros to define the  
basic set of integers, and then apply SCM_MAKISYM to the enumerator  
values, and then we can refer to the values symbolically without  
extracting bits out of constructed SCM values?
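
Roughly what I have in mind, as a sketch only: the numbering, the tag
value and the macro names below are invented for illustration and do not
match the real ISYM layout.

```c
/* Hypothetical numbering and encoding; the real ISYM layout differs.  */
enum isym_num { ISYM_NUM_AND, ISYM_NUM_IF, ISYM_NUM_LET };

#define ISYM_TAG          0x0cUL
#define MAKE_ISYM_BITS(n) (((unsigned long) (n) << 8) | ISYM_TAG)
#define ISYM_BITS_NUM(b)  ((int) ((b) >> 8))

/* Case labels use the enumerators directly; nothing is extracted from a
   constructed value at compile time, so this works whether the value
   type is an integer, a pointer, or a union.  */
static const char *
isym_name (unsigned long bits)
{
  switch (ISYM_BITS_NUM (bits))
    {
    case ISYM_NUM_AND: return "and";
    case ISYM_NUM_IF:  return "if";
    case ISYM_NUM_LET: return "let";
    default:           return "unknown";
    }
}
```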


The smob creation macros play fast and loose with types, and accept  
anything that can be cast to scm_t_bits... which doesn't include union  
types like SCM in this mode; extracting values is similarly messy.   
I'm not sure that can be cleaned up without changing the API.


There are also bits that I suspect won't build cleanly if SCM is an  
integer (STRICTNESS=0), too.


Ken