Re: Porting GNU Projects - Guile

2009-07-30 Thread Neil Jerram
bornlibra23  writes:

> Thanks Mike for the heads up but the problem I think is with the source code
> itself. I get the same error on linux though it doesnt die there. I
> preprocessed the code & changed the line like so :
>>From (ch) == ' ' | (ch) == '\t') to (ch) == ' ' || (ch) == '\t')
> (note the or operator) & the problem went away but then I had the mentioned
> problem. I will build it again & get back to you. My build log seems to be
> dated.

FWIW this is fixed in Guile's master branch:

#ifdef HAVE_ISBLANK
# define CSET_BLANK_PRED(c)  (isblank (c))
#else
# define CSET_BLANK_PRED(c) \
   (((c) == ' ') || ((c) == '\t'))
#endif

I guess it was missed for a long time because it is rare to have an OS
that doesn't provide isblank().

Regards,
Neil




Re: Elisp performance

2009-07-30 Thread Neil Jerram
Daniel Kraft  writes:

> Lambda arguments are still always dynamically bound, which is quite a
> pity as it inhibits tail-call optimization;

This prompted me to wonder if using fluids is the best way to
implement dynamic binding.

Perhaps I'm forgetting something basic, but surely it's using
`dynamic-wind' that gives us dynamic binding semantics, not fluids.

(Note that `with-fluids', which is probably the closest thing in Guile
Scheme to a dynamic let, is a combination of `dynamic-wind' and
`fluid-set!'.)

The main thing I believe that makes a fluid different from a normal
variable is that a fluid can have a different value in each thread -
which is not relevant to Elisp.

So what's my point?  I'm not sure, just musing.  As far as performance
and tail-call optimization are concerned, I would guess that the main
thing that needs to be addressed in order to reinstate tail-call
optimization would be the dynamic wind - i.e. the compiler knowing
that it isn't necessary to set up another dynamic-wind frame, it can
just jump with the current variables.

> Iterative prime sieve, (length (find-primes-to 5000)):
>   Scheme: 0.42s
>   Elisp, no void checks, lexical let: 3.40s
>   Elisp, no void checks, dynamic let: 4.43s
>   Elisp, void checks, dynamic let: 5.12s
>   Elisp, void checks, lexical let: 4.06s

As Ken says, it would be good to see an Emacs timing too.

(Also, if I get time, I'll try this with the old translation code
too.  Just for fun, and hopefully for confirmation that the new
approach is much better!)

> My conjecture is that the remaining bottle-neck are the primitive
> calls, as they are still resolved using the dynamic-binding and fluid
> mechanisms; so maybe it is really a good idea to change function
> bindings to just global ones without dynamic scoping, and implement
> flet/flet* with dynamic-wind.  In contrast to variables, for functions
> access performance is a lot more important than let performance, I
> guess, so this might be the way to go.  What do you think?

See above.  I'm not sure why fluid access should be significantly
slower than non-fluid access, so I would guess that the performance
cost is coming from the dynamic-wind part.

Regards,
Neil




Re: review/merge request: wip-array-refactor

2009-07-30 Thread Neil Jerram
Andy Wingo  writes:

> Hi Neil,
>
> Thanks for the review.
>
> On Wed 22 Jul 2009 23:48, Neil Jerram  writes:
>
>> I have two overall questions in mind.
>>
>> - What do you have in mind as regards releasing this?  Even though it
>>   looks good, I think it would be better to let it mature for a while,
>>   and hence not to put it into 1.9.x/2.0.  (And we're not short of new
>>   stuff in 1.9.x/2.0!)
>
> Personally I would prefer that it come out in 2.0. I'm fairly (but not
> entirely) confident of its consistency as it is, and quite confident
> that it is a more maintainable, and hence fixable, codebase.

I could be wrong, but I don't intuitively feel comfortable with that.
It just feels too quick/early.

On the other hand, I think this is really valuable work, and
absolutely don't want an interval of years or months before it gets
out there.

What is our release plan after 2.0?  I don't know.  I'd like something
more dynamic than the very long intervals between major releases that
we've had in the past.  But it seems there is a conflict between

- major releases being the points at which we can break the API/ABI
  (with accompanying documentation)

- wanting to have such releases more frequently than in the past, so
  that good new stuff gets out quicker

- wanting not to create grief for Guile users by changing the API/ABI
  frequently.

Is there a solution?

> The reason that I want it in is that bytevectors are a nice api for I/O,
> but once you have data in memory, often it's best to treat it as a
> uniform array -- be it of u8 color components, or s32 audio samples.
>
> Uniform vectors are almost by nature "in flight" between two places.

(Not sure I agree.  I'd say uniform vectors are mostly holding numbers
in a computation, or for plotting on a graph.)

> But
> it is the bytevector that is most equipped to "fly", via e.g. (rnrs io
> ports); but bytevectors do not provide an ideal API for manipulation.
> s32vector-ref is a better API than bytevector-s32-native-ref, if only
> that the offset of the former is in native units (s32 elements), and of
> the second in bytes.
>
> I guess what I wanted to do was break uniform vectors out of their type
> ghettoes. Like Perlis said, "Pascal is for building pyramids --
> imposing, breathtaking, static structures built by armies pushing heavy
> blocks into place. Lisp is for building organisms -- imposing,
> breathtaking, dynamic structures built by squads fitting fluctuating
> myriads of simpler organisms into place." I should be able to put a
> 64-bit float into two 32-bit words, when I'm writing an instruction
> sequence for a 32-bit machine, without having to change my data type --
> they are just bytes anyway.

That sounds to me like motivation for adding a richer API to
bytevectors (and which could all be in Scheme), not necessarily for
the deep unification of uniform and byte vectors that you have coded.

TBH, with your refactoring up to this point, I still don't have the
overall picture (arrays, uniform vectors, bitvectors, strings etc)
firmly in my head.  I'd like to do that and then reconsider your
points above.

>> - What about the header file incompatibilities?  I.e. the problem that
>>   if there are applications out that that #include ,
>>   or another of the renamed ones, they will stop working.  Do you
>>   think we are OK just to document this well, or should we do more
>>   than that?
>
> I thought that our only support API was that provided by #include
> .

Yeah, actually (and given that Ludovic has said the same) I'm happy
with that too.  Sorry for raising unnecessary concerns!

>>>   (u8vector-ref #u32(#x) 0) => 255

Note that using #x here glosses over the endianness problem.

(I think my inclination at this point is that I'd prefer explicit
conversions.)

> I ought to be able to get at the bits of a packed (uniform) vector. The
> whole point of being a uniform vector is to specify a certain bit layout
> of your data.

Huh?  I would say it is to be able to store numbers with a given range
(or precision) efficiently, and to be able to access them efficiently
from both Scheme and C.

>>>   (u8vector? #u32(#x)) => #f
>
> However, we need to preserve type dispatch:
>
>   (cond
> ((u8vector? x) ...)
> ((s8vector? x) ...)
> ...)
>
>>>   (bytevector? #u32(#x)) => #t
>
> This to me is like allowing (integer? 1) and (integer? 1.0); in
> /essence/ a #u32 is an array of bytes, interpreted in a certain way.

I think you have in mind that all uniform vectors are filled by
reading from a port, or are destined for writing out to a port.

That is an important use, but there is another one: preparing
numerical data for handling in both C and Scheme.  In this use, the
concept of an underlying array of bytes plays no part.

>>> commit 86d88a223c64276e7cd9b4503e7e2ecca5aae320
>>> Author: Andy Wingo 
>>> Date:   Thu Jul 16 21:51:47 2009 +0200
>>>
>>> remove deprecated functions from unif.c
>>> 
>>> * libgu

Re: %nil once again

2009-07-30 Thread Neil Jerram
Clinton Ebadi  writes:

> Why not eliminate %nil/#nil and replace it with ... '(). This would
> allow elisp to reuse all (or at least most) of the Scheme list functions
> without having to muddy Scheme semantics with special #f/#nil handling,
> but would require a very tiny elisp specific truth predicate. Since
> everything should be compiled into an IL instead of Scheme this should
> be straightforward. This would have the rather nice advantage of
> removing all special elisp support code from libguile itself (thus
> cleaning up libguile a bit *and* making the elisp translator a much
> better example for other language translators since every language can't
> expect to have special support from libguile).

Yes.  It's just the Elisp->Scheme data translation point again,
though, as you say:

> This has the slight downside that calling between elisp and Scheme may
> not be so natural. Any elisp that would call Scheme must know it is
> calling Scheme (or simply not-elisp) and that it ought to generate
> Scheme false objects if it is calling a predicate of some sort,

Right.  Elisp code that generates a boolean result, for use by Scheme,
would need to have that result translated from () to #f.  But then
what about Elisp code that generates a vector of boolean results, or
an alist of vectors of boolean results...?

> and any
> Scheme calling elisp should know this and use null? to check for a false
> return value.

The ultimate point of multiple language support is that an application
user can write a hook or configuration tweak in whichever language
they feel comfortable with.  The driving C or Scheme code that calls
out to that hook or tweak needs to work regardless of the user's
language choice.

> The only downside I can see to this is that if code were
> to be ported from elisp to Scheme there would now be code that assumed a
> '() return as false when #f would be false, but this could be fixed
> using a predicate like elisp-false? (or null-or-false? or
> ... something).

I don't like that at all!

In my view, the only reasonable alternative solution above is the one
that would work by translating data passing between Elisp and Scheme.
The theoretical argument against that is that such translation could
be arbitrarily complex, and so have an unbounded performance cost, but
in practice I imagine that would almost never be the case.  Also there
is the point that languages other than Elisp will _require_ data
translation, and why should we make Elisp a special case?  Hence I
wouldn't rule out the possibility of switching to this design in
future; and then we could, as you say, make nil==() and remove most
elisp special cases from the libguile core.

Regards,
Neil




Re: %nil once again

2009-07-30 Thread Neil Jerram
Andy Wingo  writes:

>> Other things needed would be for instance terminating rest-arguments
>> by %nil rather than '() and the like.

Why is that needed?  I.e. Why is it important for a rest argument to
be terminated with %nil instead of with () ?

Neil




Re: [PATCH] %nil-handling optimization and fixes v1

2009-07-30 Thread Neil Jerram
Andy Wingo  writes:

> Hi Mark,
>
> This is also not a patch review yet :)
>
> On Thu 09 Jul 2009 18:11, Mark H Weaver  writes:
>
>> I added the following macros, whose names explicitly state how %nil
>> should be handled.  See the comments in the patch for more information
>> about these.

Hi Mark,

I'm really sorry I haven't reviewed this patch yet, or commented on
your arguments for handling %nil more pervasively.  From skimming the
emails, your arguments look persuasive, but I'm afraid detailed review
will have to wait a bit longer, as I'm going to be on holiday and away
from email for a couple of weeks.  I'm sorry about that, and really
appreciate your work on this area.

Regards,
Neil




Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-1-17-g77332b2

2009-07-30 Thread Ludovic Courtès
Hello!

"Michael Gran"  writes:

> The branch, master has been updated
>via  77332b21a01fac906ae4707426e00f01e62c0415 (commit)
>   from  e5dc27b86d0eaa470f92cdaa9f4ed2a961338c49 (commit)

Oops, I hadn't realized this was in `master'.  Was it intended?  (I
don't remember seeing a discussion, but I may have skipped it.)

> Replace global charnames variables with accessors
> 
> The global variables scm_charnames and scm_charnums are replaced with
> the accessor functions scm_i_charname and scm_i_charname_to_num.
> Also, the incomplete and broken EBCDIC support is removed.

Does it have a user-visible effect?

(If so, please update `NEWS' for 1.9.1->1.9.2.)

>* libguile/print.c (iprin1): use new func scm_i_charname
> 
> * libguile/read.c (scm_read_character): use new func
> scm_i_charname_to_num
> 
> * libguile/chars.c (scm_i_charname): new function
> (scm_i_charname_to_char): new function
> (scm_charnames, scm_charnums): removed

These removals are incompatible in theory, but probably they don't
warrant a `NEWS' entry.  Thoughts?

> +const char *const scm_r5rs_charnames[] = 

Please follow the GCS when it comes to spacing, indentation, etc.  If in
doubt, run GNU Indent.

> +int scm_n_C0_control_charnames = sizeof (scm_C0_control_charnames) / sizeof 
> (char *);

Scary name!  ;-)

Shouldn't it be private?  And shouldn't it be a macro instead?

Thanks,
Ludo'.




Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-1-18-g904a78f

2009-07-30 Thread Ludovic Courtès
"Michael Gran"  writes:

> commit 904a78f11d2d11a58d5df365a44c4fbbd4c96df3
> Author: Michael Gran 
> Date:   Wed Jul 29 06:38:32 2009 -0700
>
> Add 32-bit characters
> 
> This adds the 32-bit standalone characters.  Strings are still
> 8-bit.  Characters larger than 8-bit can only be entered or
> displayed in octal format at this point.  At this point, the
> terminal's display encoding is expected to be Latin-1.

It looks like Unicode is approaching, good news!  :-)

My remark about user-visibility was actually regarding this commit, not
the previous one.

> +#ifndef SCM_WCHAR_DEFINED
> +typedef scm_t_int32 scm_t_wchar;
> +#define SCM_WCHAR_DEFINED
> +#endif

Why is this #ifdef hack needed?

> +#define SCM_CHAR(x) ((scm_t_wchar)SCM_ITAG8_DATA(x))

Please, use GCS style.

> +#define SCM_MAKE_CHAR(x) ({scm_t_int32 _x = (x);\
> +  _x < 0\
> +? SCM_MAKE_ITAG8((scm_t_bits)(unsigned char)_x, scm_tc8_char)   \
> +: SCM_MAKE_ITAG8((scm_t_bits)_x, scm_tc8_char);})

This macro uses a GCC extension, which is not acceptable for Guile.  Can
you please rewrite it in standard C?  (The only risk is multiple
expansion of X, but that's OK.)

Does X < 0 mean ASCII?  And why is it truncated to 8 bits?  A comment
just above indicating the encoding trick would be handy IMO.

(And style isn't OK.)

> +#define SCM_CODEPOINT_MAX (0x10)
> +#define SCM_IS_UNICODE_CHAR(c)  \
> +  ((scm_t_wchar)(c)<=0xd7ff ||  \
> +   ((scm_t_wchar)(c)>=0xe000 && (scm_t_wchar)(c)<=SCM_CODEPOINT_MAX))

Style.

> +  if (i<256)
> +{
> +  /* Character is graphic.  Print it.  */
> +  scm_putc (i, port);
> +}

Style (extraneous braces).

> +VM_DEFINE_INSTRUCTION (42, make_char32, "make-char32", 4, 0, 1)
> +{
> +  scm_t_wchar v = 0;
> +  v += FETCH ();
> +  v <<= 8; v += FETCH ();
> +  v <<= 8; v += FETCH ();
> +  v <<= 8; v += FETCH ();
> +  PUSH (SCM_MAKE_CHAR (v));
> +  NEXT;
> +}

The doc will need to be augmented.

> + ((char? x)
> + (cond ((<= (char->integer x) #xff)
> +`(make-char8 ,(char->integer x)))
> +   (else
> +`(make-char32 ,(char->integer x)

Sounds cool!  :-)

Thanks,
Ludo'.




Re: Reporting unused local variables

2009-07-30 Thread Ludovic Courtès
Hello!

I just committed the changes summarized below that add
`-W unused-variable' to "guile-tools compile".

It appears to work well, but only has approximate source location info
for `define-macro' expansions, for instance (but that's another story).

Daniel: could you try it with the Elisp front-end and report back?

I'm planning to add warnings for possibly unbound variables, and global
bindings unused in the current module and not exported (and eventually
anything Guile-Lint supported).

Thanks,
Ludo'.

  commit 4b856371b3e85cd82f6d637f72bc610d0158b5de
  Author: Ludovic Courtès 
  Date:   Fri Jul 31 00:42:58 2009 +0200

  Add unused variable analysis in the tree-il->glil compiler.

  * module/language/tree-il/analyze.scm (): New record type.
(report-unused-variables): New procedure.

  * module/language/tree-il/compile-glil.scm (%warning-passes): New
variable.
(compile-glil): Honor `#:warnings' from OPTS.

  * test-suite/tests/tree-il.test (call-with-warnings): New procedure.
(%opts-w-unused): New variable.
("warnings"): New test prefix.

  commit 2e4c3227ce1374dd53abd3c7c5797cc64329de91
  Author: Ludovic Courtès 
  Date:   Fri Jul 31 00:06:59 2009 +0200

  Add `(system base message)', a simple warning framework.

  * module/Makefile.am (SOURCES): Add `system/base/message.scm'.

  * module/scripts/compile.scm (%options): Add `--warn'.
(parse-args): Update default value for `warnings'.
(show-warning-help): New procedure.
(compile)[compile-opts]: Add `#:warnings'.
Update help message.

  * module/system/base/compile.scm (compile): Sanity-check the requested
warnings.

  * module/system/base/message.scm: New file.

  commit f4aa0f104b3347c21093b837046022fb7bb6a2ff
  Author: Ludovic Courtès 
  Date:   Thu Jul 30 00:48:04 2009 +0200

  Add `tree-il-fold', a purely functional iterator on `tree-il'.

  * module/language/tree-il.scm (tree-il-fold): New procedure.

  * test-suite/tests/tree-il.test ("tree-il-fold"): New test prefix.





Re: Elisp performance

2009-07-30 Thread Ken Raeburn

On Jul 30, 2009, at 16:18, Neil Jerram wrote:

The main thing I believe that makes a fluid different from a normal
variable is that a fluid can have a different value in each thread -
which is not relevant to Elisp.


Not yet, at least.

And maybe that's enough.  There's other stuff in Emacs besides  
variable bindings that would require dynamic-wind support, like flet,  
save-excursion (preserves current buffer and position), with-output-to- 
string and with-output-to-temp-buffer (preserve 'standard-output'), as  
well as any number of explicit calls to unwind-protect from Elisp code  
to alter and restore other global state of the program (e.g., with set- 
default-file-modes or set-standard-case-table).  The current  
incarnation of these things assumes a single-threaded execution  
model.  So, since we can't make Emacs multithreaded by simply dropping  
Guile Elisp into it, if Emacs is all we're concerned about, maybe  
assuming single-threaded execution only is okay for now.


On the other hand, if we want users to be able to write code snippets  
in Elisp and have them callable from a concurrently-multithreaded  
Scheme program (no Emacs processes in sight), I think we'll want the  
multithreaded support for the basic Elisp language sooner rather than  
later, even if multithreaded Emacs needs much more work.


I also don't know how complicated a switch to stop using fluids would  
be.  If it's not too painful, the performance impact would be  
interesting to see -- does it get us closer to the performance of  
Emacs itself?



See above.  I'm not sure why fluid access should be significantly
slower than non-fluid access, so I would guess that the performance
cost is coming from the dynamic-wind part.


For a program the size of Emacs, I'd be concerned about the numbers of  
fluids and dynamic states created, especially since the space cost  
(and run time cost for growing the vectors, every 20th new entry)  
grows with the product of the two.  The number of dynamic states may  
shrink, but the size of each dynamic state -- the number of allocated  
fluid slots -- doesn't shrink, it only grows.


While they don't really correlate to anything in Guile Elisp directly,  
the default max_specpdl_size (the specpdl stack is where things are  
tracked for unwinding -- previous bindings, unwind-protect calls,  
previous location data for save-excursion, etc) was raised to 1000 a  
few years ago, and the maximum lisp evaluation depth (limits calls to  
eval, funcall, apply) to 400 just a couple years ago.  I assume that's  
because the previous limits (650 and 300 respectively) were  
occasionally getting reached.  That suggests to me rather a lot of  
dynamic states, and probably a large number of fluids.


Ken




Re: [Guile-commits] GNU Guile branch, master, updated. release_1-9-1-17-g77332b2

2009-07-30 Thread Mike Gran
On Fri, 2009-07-31 at 00:57 +0200, Ludovic Courtès wrote:
Hello!
> 
> "Michael Gran"  writes:
> 
> > The branch, master has been updated
> >via  77332b21a01fac906ae4707426e00f01e62c0415 (commit)
> >   from  e5dc27b86d0eaa470f92cdaa9f4ed2a961338c49 (commit)
> 
> Oops, I hadn't realized this was in `master'.  Was it intended?  (I
> don't remember seeing a discussion, but I may have skipped it.)
> 

It was discussed after a fashion.  This is a precursor to the tree
that Andy reviewed, and he seemed to be okay with me committing some of
it.[1]  With all the exciting stuff going on, I was having a little
trouble getting some mindshare. [2] [3] So I pushed this one since
it was kind of your idea anyway. [4]


> > Replace global charnames variables with accessors
> > 
> > The global variables scm_charnames and scm_charnums are replaced
with
> > the accessor functions scm_i_charname and scm_i_charname_to_num.
> > Also, the incomplete and broken EBCDIC support is removed.
> 
> Does it have a user-visible effect?

No change at all in the character names it will accept.  A minor
change on the output: writing U+0012 now gives #\ff over #\np.
Certainly the removal of EBCDIC would be a user-visible effect if
there were an EBCDIC user. 

> 
> (If so, please update `NEWS' for 1.9.1->1.9.2.)
> 
> >* libguile/print.c (iprin1): use new func scm_i_charname
> > 
> > * libguile/read.c (scm_read_character): use new func
> > scm_i_charname_to_num
> > 
> > * libguile/chars.c (scm_i_charname): new function
> > (scm_i_charname_to_char): new function
> > (scm_charnames, scm_charnums): removed
> 
> These removals are incompatible in theory, but probably they don't
> warrant a `NEWS' entry.  Thoughts?

If they were API, they weren't well documented as such.  They look like
internal information to me.

> 
> > +const char *const scm_r5rs_charnames[] = 
> 
> Please follow the GCS when it comes to spacing, indentation, etc.  If
in
> doubt, run GNU Indent.

OK.

> 
> > +int scm_n_C0_control_charnames = sizeof
(scm_C0_control_charnames) / sizeof (char *);
> 
> Scary name!  ;-)
> 
> Shouldn't it be private?  And shouldn't it be a macro instead?

OK.

> 
> Thanks,
> Ludo'.

Footnotes: 
[1] http://lists.gnu.org/archive/html/guile-devel/2009-06/msg00084.html

[2]  http://lists.gnu.org/archive/html/guile-devel/2009-07/msg00163.html

[3]  http://lists.gnu.org/archive/html/guile-devel/2009-07/msg00077.html

[4]  http://lists.gnu.org/archive/html/guile-devel/2009-02/msg00073.html






Re: %nil once again

2009-07-30 Thread Daniel Kraft

Hi Neil,

Neil Jerram wrote:

The only downside I can see to this is that if code were
to be ported from elisp to Scheme there would now be code that assumed a
'() return as false when #f would be false, but this could be fixed
using a predicate like elisp-false? (or null-or-false? or
... something).


I don't like that at all!

In my view, the only reasonable alternative solution above is the one
that would work by translating data passing between Elisp and Scheme.
The theoretical argument against that is that such translation could
be arbitrarily complex, and so have an unbounded performance cost, but
in practice I imagine that would almost never be the case.  Also there
is the point that languages other than Elisp will _require_ data
translation, and why should we make Elisp a special case?  Hence I
wouldn't rule out the possibility of switching to this design in
future; and then we could, as you say, make nil==() and remove most
elisp special cases from the libguile core.


I agree here; though I think we should get translation to a minimum as 
much as possible; for instance, I still think the Guile core should 
treat %nil as false and empty list if asked for.  This does not hurt 
Scheme, as %nil does not occur there, but it would allow us to just use 
TreeIL conditionals and the like without having to translate here.


On the other hand, if we want elisp semantics as faithfully as possible, 
we need to translate the way from Guile primitives to elisp code. 
Currently I'm already doing this for booleans; for instance, the = 
built-in of emacs is defined something like:


(lambda (a b)
  (if (= a b)
t
%nil))

so that the "false" value is really %nil.  On a later thought, this 
might be the real reason why my "tight while loop" performed worse than 
the Scheme version in the performance test.  But I don't see a way how 
to get around this, unfortunatly; except if we just assume that the 
elisp user won't test a false value for (eq value nil) but just use it 
as boolean in conditionals, too, so it does not matter if the value is 
%nil or #f.


The same applies to lists...  In general, one can write probably most 
elisp code without having to really care if the tail of a rest argument 
or the result from the list built-in or some quotation-expression is 
%nil or '() (and should actually check with null instead of (eq nil), I 
think).  At the moment it is like that, there's no '() <--> %nil 
translation happening by now.  As long as libguile/VM gets the patch to 
accept %nil as end-of-list (as the interpreter already does), one can 
still do (cons 'a (cons 'b nil)) in elisp and get the expected result...


I'm still not sure if we should actually do data translation from Scheme 
(Guile built-ins) to elisp user code or not...  But probably we should 
just do it and the performance hit will not be too painful.


Daniel

--
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri




Re: %nil once again

2009-07-30 Thread Daniel Kraft

Neil Jerram wrote:

Andy Wingo  writes:


Other things needed would be for instance terminating rest-arguments
by %nil rather than '() and the like.


Why is that needed?  I.e. Why is it important for a rest argument to
be terminated with %nil instead of with () ?


Well, I don't think it is "important" at all, but it would be needed for 
"complete elisp" semantics, just because it is a list and lists should 
be terminated by %nil in elisp.  Someone might try to check for the 
end-of-list with (eq tail nil) instead of (null tail) and it should 
still work.


BTW, for an empty rest argument the elisp documentation explicitly 
states it "should" be nil (well, but that's obviously so because nil is 
the empty list).


Daniel

--
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri




Re: Elisp performance

2009-07-30 Thread Daniel Kraft

Ken Raeburn wrote:
Obviously, it would help a lot to do so.  On the other hand, switching 
to primitive-ref's would help even more, but I fear we can not easily 
do so, because we can not know if a symbol targets a primitive or was 
rebound at compile time...  BTW, a quick test with Scheme:

[]
So it seems that the Scheme compiler just ignores this possibility... 
Is (set! + ...) and expecting (+ 1 2) to change invalid, or should 
this strictly be regarded as a bug?


In general I don't think you can assume it for all symbols, but if it 
helps, the Emacs byte-compiler also assumes that "+" and "cons" and 
certain other functions won't be redefined.  It's even got an "add1" 
opcode.


So if I understand right, if you make similar assumptions and change how 
function bindings are handled, your performance for this code drops to 
under a second?  It sounds like maybe you can get within shouting 
distance of Emacs's own performance already, at least for certain test 
cases.


Well, that's partially true.  For those built-ins that map directly to 
Guile primitives, it would probably be an advantage to build 
make-primitive-ref's for the generated TreeIL code directly; I'm not 
sure if that's done at the moment, but in the future this will help for 
things like optimization and special op-codes (e.g. add1).


I think it would really be reasonable to assume certain symbols don't 
get rebound, if they don't have a different lexical binding at the 
moment; although I think that the concept of dynamic binding is actually 
also about the ability to rebind even built-ins to allow for changes... 
 For instance, what if I wanted to write a program that evaluates the 
"efficiency" of some numerical algorithm by overloading + and * to count 
the number of operations performed?  This seems like a valid need to me 
(in fact, I might be doing something similar for my Bachelor's thesis; 
though probably not in Scheme or elisp, so this does not directly matter 
here).


So my idea was to provide a compiler option to always use an ordinary 
function call for certain or all primitives as a compromise; that sounds 
like a quite good idea to me catering for both needs.


However, as a side-note:  I don't think my code would drop below one 
second if this was implemented (hm, at least I'm not sure), because for 
instance all built-ins returning booleans (like < in the example) can 
not map directly to Guile primitives because I need to translate #f to 
%nil inbetween...  It's a pity because comparisons are probably quite 
common especially in such loops, but if we don't want to get rid of 
translation and don't care about #f in elisp (see my other post in the 
%nil thread), I see no way around this.


Would this interfere with possibly blending Scheme GOOPS code with Elisp 
someday?  Or is the generic support there at a lower level than this?  
(E.g., a "marker" object holds a buffer handle, possibly nil, and an 
offset that automatically gets adjusted if text is inserted before it.  
You can use "(+ 1 marker)" and get back an integer one greater than the 
marker's current offset.  If markers were implemented using GOOPS, would 
this use of "+" work, given the changes you're suggesting?)


To be honest, I've nearly no idea about GOOPS so far and thus can't 
comment here...


Yours,
Daniel

--
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri




Re: Elisp performance

2009-07-30 Thread Daniel Kraft

Hi Neil,

Neil Jerram wrote:
Daniel Kraft  writes: 

Lambda arguments are still always dynamically bound, which is quite a
pity as it inhibits tail-call optimization;


This prompted me to wonder if using fluids is the best way to
implement dynamic binding.

Perhaps I'm forgetting something basic, but surely it's using
`dynamic-wind' that gives us dynamic binding semantics, not fluids.


I also thought about this yesterday; and I think you're right, just as I 
propose to do for function bindnigs we could also do for values. 
Somehow I had the impression in the back of my head that fluids are 
meant to provide dynamic binding as we need it and with-fluid* does some 
"magic" to do it most efficiently.


However, as I now see it, the dynamic state is meant to apply to threads 
and not between different with-fluid*'s, right?  So instead of 
with-fluid* using just a dynamic-wind will be faster (or at least on 
par)...?


If so, I think we should get rid of the fluids and switch to directly 
using dynamic-wind.  The point about future multi-threading might be 
interesting, though...  If we did switch elisp to multi-threading at 
some point, what would become of dynamic binding?  I guess we can't do 
this across threads, so each dynamically bound variable would also be 
thread local?  I think this is the most reasonable concept, trying to 
make dynamic bindnig work across threads looks really messy to me. 
Then, the fluid concept will be just what we need again;  but we 
probably also want to find a way for shared global variables -- which 
has time until then, of course ;)


Another thing:  If we use dynamic-wind directly instead of the current 
fluid system, we could get rid of void as special value and instead just 
unbind the symbol from our module in case of void; then, on access Guile 
would complain itself and we wouldn't need the special checks for void. 
 With dynamic-wind, we could ensure that variables are re-bound or 
unbound to their previous state when leaving the dynamic scope.  This 
rebinding would, however, be more expensive as we would have to care for 
a lot more special cases.  With regards to void, I like the current 
implementation with the possibility to disable the access check if not 
needed and then get full speed.  So I'm not sure if I would change void 
handling at all even if we switched away from fluids and it became possible.



(Note that `with-fluids', which is probably the closest thing in Guile
Scheme to a dynamic let, is a combination of `dynamic-wind' and
`fluid-set!'.)


Yes, I agree; and with-fluids is quite nice to use, also.  But as the 
compiler handles dynamic binding in our case, I also don't care about 
explicitly setting/reverting the variables with dynamic-wind.  If 
with-fluids uses dynamic-wind internally, we can only win on performance 
and all we lose seems to be the thread-locality for the future.  But 
this may still be a point, I'm not sure...



So what's my point?  I'm not sure, just musing.  As far as performance
and tail-call optimization are concerned, I would guess that the main
thing that needs to be addressed in order to reinstate tail-call
optimization would be the dynamic wind - i.e. the compiler knowing
that it isn't necessary to set up another dynamic-wind frame, it can
just jump with the current variables.


Hm, I'm not sure if I got your point correctly, but I think that dynamic 
binding and tail-call optimization are difficult to combine in general, 
no matter if the dynamic binding is done with fluids or dynamic-wind.


As you said, the point is about the compiler or VM handling the 
"unwinding" (i.e. resetting changed dynamic bindings to their old 
values) and doing this in a way to allow tail-call optimization.  I 
don't think a tail-call is possible in general at all if the caller's 
body is executed inside a dynamic wind (well, at least for recursive 
calls, each level will have to create it's own unwinding-context and 
thus grow the stack despite tail-calls)...?  I'm not very much into 
theoretical considerations in this topic, so maybe you know more about 
if/how/when tail optimization is possible together with dynamic binding 
or dynamic-wind; but some ideas below.


For dynamic binding however, I think one could do tail-call 
optimization, at least in some cases like when the callee binds a 
superset of the symbols bound by the callee (which would suffice for 
optimization of recursive calls).  But then we would need to do dynamic 
binding in some other way than with dynamic-wind (i.e. with some 
stack-like data structure where we can push/pop values for the symbols) 
and also have to know about tail-calls in the elisp compiler directly; I 
don't think we should go for this...  When I implement the option to 
"always" bind certain symbols lexically, at least for lambda's with 
lexically bound arguments tail-call optimization will again be available 
for free.


But if we want to work on this, a not-so-bad idea could be thinking 
about

Re: Elisp performance

2009-07-30 Thread Daniel Kraft

Ken Raeburn wrote:
 > And maybe that's enough.  There's other stuff in Emacs besides variable
bindings that would require dynamic-wind support, like flet, 
save-excursion (preserves current buffer and position), 
with-output-to-string and with-output-to-temp-buffer (preserve 
'standard-output'), as well as any number of explicit calls to 
unwind-protect from Elisp code to alter and restore other global state 
of the program (e.g., with set-default-file-modes or 
set-standard-case-table).  The current incarnation of these things 
assumes a single-threaded execution model.  So, since we can't make 
Emacs multithreaded by simply dropping Guile Elisp into it, if Emacs is 
all we're concerned about, maybe assuming single-threaded execution only 
is okay for now.

>
On the other hand, if we want users to be able to write code snippets in 
Elisp and have them callable from a concurrently-multithreaded Scheme 
program (no Emacs processes in sight), I think we'll want the 
multithreaded support for the basic Elisp language sooner rather than 
later, even if multithreaded Emacs needs much more work.


As already stated upthread, I'm not sure about this myself...  It would 
be cool to stay as flexible as possible with regards to future 
multi-threading, but on the other hand I also like the idea of getting 
rid of the fluids.  My timings there seem to suggest that fluids at 
least have "some" performance hit.


Of course, anyone interested in performance can also use lexical-let 
instead of let and also get rid of all this performance problems as well 
;)  But the same argument may hold for the threading argument, too, so 
if you want to write elisp code that is called from multi-threaded 
Scheme, just avoid dynamic binding and you'll get no problems...


I also don't know how complicated a switch to stop using fluids would 
be.  If it's not too painful, the performance impact would be 
interesting to see -- does it get us closer to the performance of Emacs 
itself?


Well, it will not be trivial but also quite doable, I think.  Regarding 
the performance, I'm quite sure it will help, but not so much about the 
actual quantitative impact -- but if we're lucky, it might be like that 
between dynamic and lexical let of my timings in the original post. 
This seems quite plausible to me.


Daniel

--
Done:  Arc-Bar-Cav-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Kni-Mon-Pri