Re: Apoc 5 questions/comments

2002-06-10 Thread Ariel Scolnicov

Damian Conway <[EMAIL PROTECTED]> outlined his plans for world domination:

[...]

> 
> Dammit, you fools! Do I have to think of *everything*??? Just tie him to a 
> steel bench and apply the Ruby laser!
> 
> I do apologize, Mr Wardley. Good evil assistants are just impossible to get
> these days.  
> 

Cut the Smalltalk.  It's off-topic.

-- 
Ariel Scolnicov




Re: [COMMIT] Subs and co-routines in Parrot

2002-06-10 Thread Dan Sugalski

At 12:53 PM +0200 6/9/02, Jerome Vouillon wrote:
>On Sat, Jun 08, 2002 at 03:54:06PM -0400, Melvin Smith wrote:

>The Java bytecode interpreter is clearly not optimized for speed.
>David Gregg, Anton Ertl and Andreas Krall have experimented with an
>improved Java bytecode interpreter.  One of the optimisations they
>perform is to get rid of this copying.  Overall, they get a factor of 5
>to 10 improvement on some JavaSPEC benchmarks over other Java bytecode
>interpreters.
>(See "A fast Java interpreter" and "Implementing an efficient Java
>  interpreter" from http://www.complang.tuwien.ac.at/papers/
>  By the way, Anton Ertl has written a lot of other very good papers on
>  bytecode interpreters.)

Right, the Java folks always planned for a JIT, or at least a 
rewrite, and didn't spend that much time optimizing the initial 
interpreter. Which, honestly, we're doing too.

>  > >Yeah, that's too much work for me. I'd rather do something simpler, even
>>  >if that boils down to "we return a single ParrotList with all the return
>>  >values in it, stuck in P0".
>
>Yes, that would be fine.  We can still optimize this later if
>necessary.  The paper "An Efficient Implementation of Multiple Return
>Values in Scheme" by Ashley and Dybvig presents a possible
>implementation.
>(http://citeseer.nj.nec.com/ashley94efficient.html)

Yeek. On initial read, that just screams out "Huge Hack". That might 
just be the Scheme bits, though. :)

Some interesting ideas. I'll have to think about it some.
-- 
 Dan

--"it's like this"---
Dan Sugalski  even samurai
[EMAIL PROTECTED] have teddy bears and even
   teddy bears get drunk



Re: Stack

2002-06-10 Thread Dan Sugalski

At 11:09 AM -0400 6/6/02, Jason Gloudon wrote:
>
>This seems like a good time to send in this patch:
>
>It allocates the stack content memory using a buffer. This makes the stack
>chunks and the memory used to hold stack contents visible to the garbage
>collector.  One can incrementally add to this to support copy-on-write
>semantics for the chunk contents, which I understand is going to be useful in
>taking continuations. This could be done by making the stack chunks themselves
>buffer headers or perhaps PMCs, which would be cool for introspection.

Applied, thanks.
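
A minimal sketch of the copy-on-write idea mentioned in the patch note, for readers following along. It is not the patch itself: the Chunk layout, CHUNK_SLOTS, and the chunk_* names are hypothetical, and a plain reference count stands in for whatever the garbage collector would actually track.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SLOTS 16   /* arbitrary size for this sketch */

/* Hypothetical stack chunk; not Parrot's real structure. */
typedef struct Chunk {
    int    refs;               /* owners (stacks, continuations) sharing it */
    size_t used;               /* slots currently in use */
    long   slot[CHUNK_SLOTS];
} Chunk;

/* Allocate an empty chunk with a single owner. */
Chunk *chunk_new(void)
{
    Chunk *c = calloc(1, sizeof *c);
    if (c != NULL)
        c->refs = 1;
    return c;
}

/* Taking a continuation shares the chunk instead of copying it. */
Chunk *chunk_share(Chunk *c)
{
    assert(c != NULL && c->refs > 0);
    c->refs++;
    return c;
}

/* Call before writing. An unshared chunk is returned as-is; a shared
 * one is copied so the other owners keep the old contents.
 * Returns NULL if the copy cannot be allocated. */
Chunk *chunk_for_write(Chunk *c)
{
    assert(c != NULL && c->refs > 0);
    if (c->refs == 1)
        return c;
    Chunk *copy = malloc(sizeof *copy);
    if (copy == NULL)
        return NULL;
    memcpy(copy, c, sizeof *copy);
    copy->refs = 1;   /* the copy is private to the writer */
    c->refs--;        /* the writer no longer uses the shared chunk */
    return copy;
}
```

The cost of the copy is only paid on the first write after a continuation has been taken; a stack that is never captured always takes the refs == 1 path.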
-- 
 Dan




Re: Subs for parrot

2002-06-10 Thread Jerome Vouillon

On Sun, Jun 09, 2002 at 05:18:31PM -0400, Dan Sugalski wrote:
> Who says we're only using callcc to capture continuations? We can do 
> it anywhere, so we potentially need the registers stored so we can 
> properly restore state when we're invoked.

I don't understand what you mean.  In Scheme, callcc captures the
current continuation and passes it to a function.  If our callcc does
not capture the current continuation, what does it do?

-- Jerome



Re: Apoc 5 questions/comments

2002-06-10 Thread Jonathan Scott Duff

On Sun, Jun 09, 2002 at 03:34:16PM +1000, Damian Conway wrote:
> Trey Harris wrote:
> > rule val {
> > [   # quoted
> >$b := <['"]>
> >( [ \\. | . ]*? )
> >$b
> > ] | # or not
> >(\H+)
> > }
> 
> Not quite. Assigning to $b is a capture. 

I'm confused. The examples in A5 all show $var := (pattern). So are you
saying that parens or no, binding with := affects a capture into
$1,$2,etc.? Or that it affects a capture that alters the return value
of the rule, just not $1,$2,etc.?

A5 says:

Subrules called via <...> also capture their result in
hypothetical variables. It's possible to name the results of any
<...>, but grammar rules already have a name by default, so you
don't have to give them names unless you call the same rule more
than once. So, presuming you have grammar rules defining "key"
and "value", you can say:

/ <key> \: <value> { let %hash{$key} = $value } /

So ... should this work?

rule quote  { <["']> }
rule quotedword { <quote> (<alpha>+) $quote }
$justtheword = /<quotedword>/;

And if the above works, why can't "$var:=atom" be a short hand for a
lexical "rule var { atom }" that only applies for the current ... um ...
rule/? And thus the capture would be out
of band WRT $1, $2, etc. or the rule's return value.

-Scott
-- 
Jonathan Scott Duff
[EMAIL PROTECTED]



Re: Apoc 5 questions/comments

2002-06-10 Thread Dave Storrs


I assume that 'fatal.pm' is a new pragma.

1) What (if anything) does it do, aside from turning 'fail' into a fatal
exception when used outside a regex?

2) Do you need to use it before you can (usefully) use 'fail' INSIDE a
regex?  (I would assume not, but thought I'd check.)


Dave



On Fri, 7 Jun 2002, Larry Wall wrote:

> On Fri, 7 Jun 2002, Dave Storrs wrote:
> > Just to be sure I understood:  you meant that (A) yes, you can use
> > fail in a subroutine outside a regex, and (B) if you do, it is no
> > different from die.  Is that correct?
>
> Depends on the caller's use of "use fatal".  If they don't use fatal,
> it returns undef.
>
> Larry
>
>




Re: Apoc 5 questions/comments

2002-06-10 Thread Larry Wall

On Mon, 10 Jun 2002, Dave Storrs wrote:

> 
> I assume that 'fatal.pm' is a new pragma.

Already exists for Perl 5, actually.

> 1) What (if anything) does it do, aside from turning 'fail' into a fatal
> exception when used outside a regex?

What fatal currently does is wrap built-ins that might return undef with code
that will die when undef is returned.  I'm just generalizing that to having
a keyword that fails in whatever way the calling context desires, whether
by returning undef, throwing an exception, or backtracking the current regex.

> 2) Do you need to use it before you can (usefully) use 'fail' INSIDE a
> regex?  (I would assume not, but thought I'd check.)

No, it'll be built-in.  You'll only need to invoke the pragma to change the
defaults.

Larry




Re: Apoc 5 questions/comments

2002-06-10 Thread Dave Storrs


On Mon, 10 Jun 2002, Larry Wall wrote:

> On Mon, 10 Jun 2002, Dave Storrs wrote:
>
> >
> > I assume that 'fatal.pm' is a new pragma.
>
> Already exists for Perl 5, actually.

*blush* Must have missed it.  Drat, and I just finished rereading
Camel III.  Apologies.


Dave




Re: Apoc 5 questions/comments

2002-06-10 Thread Dave Storrs



On Fri, 7 Jun 2002, Luke Palmer wrote:

> > Dave Storrs wrote:
> > Can we please have a 'reverse x' modifier that means "treat whitespace as
> > literals"?  Yes, we are living in a Unicode world now and your data could
> >
> > /FATAL ERROR\:    Process (\d+) received signal\: (\d+)/
>
> I don't see how this example is nearly as flexible as this:
>
>   m:w/FATAL ERROR\:  Process (\d+) received signal\: (\d+)/
>
> Yours will only match 4 spaces after FATAL ERROR:, whereas mine will match
> any number. [...]
>   I see the :w modifier as a good flexibility enforcement. It will
> keep people away from matching things that very literally.


Respectfully, Luke, I think you and I are discussing separate
issues.  You are talking about the best way to match multiple
whitespace--and I agree with what you're saying, one should never assume
that it will always be 4 spaces instead of 5.  Were I writing that code in
a real project, instead of as a demo for the list, I would use \s+ (in P5,
anyway...in P6, whether I would use \s+ or \h+ would depend on
circumstances).

However, the point I was making was that, if I feel confident in
only handling a limited subset of the possibilities because I know what
I'm going to be getting (because, e.g., I wrote it out myself), then I
would like a way to do away with the visual clutter involved in
backwhacking or entity-izing every bit of whitespace.  Perl has never been
a nanny-language...one of its greatest strengths has always been that it
trusts me to make my own decisions and, if I want to shoot myself in the
foot, I can. :>



The suggestions that other people have been making about defining
subrules and then building them up in order to make the entire match are
good, and in general that's a very powerful technique.  However, the lines
devoted to those subrules still count as visual clutter, and I'd still
like a way to do away with them.

Dave




Re: Apoc 5 questions/comments

2002-06-10 Thread Damian Conway

Jonathan Scott Duff wrote:

> > > rule val {
> > > [   # quoted
> > >$b := <['"]>
> > >( [ \\. | . ]*? )
> > >$b
> > > ] | # or not
> > >(\H+)
> > > }
> >
> > Not quite. Assigning to $b is a capture.
> 
> I'm confused. The examples in A5 all show $var := (pattern). So are you
> saying that parens or no, binding with := affects a capture into
> $1,$2,etc.? Or that it affects a capture that alters the return value
> of the rule, just not $1,$2,etc.?

The latter.


> So ... should this work?
> 
> rule quote  { <["']> }
> rule quotedword { <quote> (<alpha>+) $quote }
> $justtheword = /<quotedword>/;

My understanding is that it won't just return the word. If you invoke a named rule, 
its return value is captured in a hypothetical variable of the same name (but *not* 
into a numbered hypovar -- only parens do that). The named hypovar lives inside the 
object that is ultimately returned to the next level up.

So C<quote> returns (what appears to be) a simple string to C<quotedword>, 
but -- because of the captures it does -- C<quotedword> returns an object with 
embedded C<$quote> and C<$alpha> hypovars.


> And if the above works, why can't "$var:=atom" be a short hand for a
> lexical "rule var { atom }" that only applies for the current ... um ...
> rule/? And thus the capture would be out
> of band WRT $1, $2, etc. or the rule's return value.

As explained above, named captures *are* out-of-band wrt $1, $2, etc. 
Just not wrt the return value.

As I mentioned in a previous post, the issue is how to control what a given
(sub-)rule returns (i.e. all its explicit and implicit captures, or just a specific
result). I think the correct answer is to control that explicitly, via a 
 assertion or a $RETURN:= capture.

Damian



Re: Apoc 5 questions/comments

2002-06-10 Thread Larry Wall

On Sun, 9 Jun 2002 [EMAIL PROTECTED] wrote:
: The parsing of perl 6 is the application of a huge, compiled, regex, correct? 

No, it's a system of compiled regexes which we're calling a grammar.

: In order to parse the new syntax, perl6 is going to have to compile the
: new rule, and stick it in the place of the old one, for the duration of the 
: scope, right?

Doesn't exactly "stick it in place of" except in an abstract sense.
It uses ordinary method overriding to hide the old rule.

: Now what happens to the parser at large if you have dependencies on what has
: changed - ex: if you change the rule for brackets, say so that all '[' are now
: actually '[[' and all ']' are now  ']]'. Won't the whole regex for parsing
: perl need to be recompiled for the duration of the block, or at least the
: dependencies on the things that you changed? And won't *that* be slow and/or
: memory intensive?

No, only the rule in question is compiled, and that only happens once regardless
of how often you invoke the rule.  There might possibly be a compilation phase
when you derive a new grammar from an old one, but that's a tradeoff we can
make when we get to it.  It'd still only happen once for a given grammar.
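
For the bracket example above, "ordinary method overriding" could be sketched roughly as follows. This is an illustration only: the Grammar and Rule structs, grammar_lookup, and the match_* functions are hypothetical names, not anything from the Perl 6 or Parrot sources. A derived grammar lists only the rules it changes and falls back to its parent for everything else, so nothing else needs recompiling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* A compiled rule: returns the number of characters matched, or -1. */
typedef int (*RuleFn)(const char *input);

typedef struct {
    const char *name;
    RuleFn      fn;
} Rule;

/* Hypothetical grammar: its own rules plus a parent to fall back on. */
typedef struct Grammar {
    const struct Grammar *parent;
    const Rule           *rules;
    size_t                nrules;
} Grammar;

/* Look a rule up by name, innermost grammar first, so a derived
 * grammar hides the parent's rule without touching the parent. */
RuleFn grammar_lookup(const Grammar *g, const char *name)
{
    assert(name != NULL);
    for (; g != NULL; g = g->parent)
        for (size_t i = 0; i < g->nrules; i++)
            if (strcmp(g->rules[i].name, name) == 0)
                return g->rules[i].fn;
    return NULL;
}

/* Base rules: single brackets. */
int match_open_single(const char *s)  { return s[0] == '[' ? 1 : -1; }
int match_close_single(const char *s) { return s[0] == ']' ? 1 : -1; }

/* Override for the block-scoped grammar: '[[' instead of '['. */
int match_open_double(const char *s)
{
    return (s[0] == '[' && s[1] == '[') ? 2 : -1;
}
```

A derived grammar that overrides only "open_bracket" with match_open_double still resolves "close_bracket" to the base rule through the parent pointer, and the base grammar itself is unchanged once the scope ends.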

: And if the rules are somehow abstracted in the perl6 parser/parrot/regex engine,
: so that each 'rule' is in essence a pointer to the real code corresponding to 
: interpreting that rule (so it can be replaced easily by user defined ones) - 
: well won't that abstraction hurt the performance of parsing regular perl?

If that becomes an issue we can always install a hard-wired lexer/parser as the
base grammar.

: And finally, if the regular expressions are in bytecode to get this flexibility
: as opposed to native machine code, what sort of overhead will this impose on 
: the regex engine?

Er, Perl 5's regexes are in their own peculiar bytecode, not in
native machine code.  If anything, we'll be better off with Perl 6's
JITable bytecode.

: I know the above might be a bit simplistic, and since it's an implementation 
: question I'm posting to perl6-internals instead, but the post is more for the 
: point of clarification about what's going on than anything else. I'd love to 
: see this happen, would use it all the time..

That's our dream.

Larry




A5 - A job well done

2002-06-10 Thread Richard Proctor

Larry,

Wow, that was a very good demolition and rebuilding of the regex edifice.

When the RFCs were being written I spent many hours thinking over some
of the issues and writing many of the RFCs on regexes, trying to build on
what was in perl5, without changing the existing language use.  By allowing
change to that starting point you have done a much better job of it.  (I was
not a novice in this as I had done research in pattern matching at
University many many years ago)

At the time of the RFCs I was employed and hence had more free time to
spend thinking about the design of perl6 than I do at present.  (How is it
that being unemployed I have LESS free time...)

Richard

-- 
Personal     [EMAIL PROTECTED]  http://www.waveney.org
Telecoms     [EMAIL PROTECTED]  http://www.WaveneyConsulting.com
Web services [EMAIL PROTECTED]  http://www.wavwebs.com
Independent Telecommunications Consultant, ATM expert, Web Analyst & Services




Re: Subs for parrot

2002-06-10 Thread Dan Sugalski

At 11:31 AM +0200 6/10/02, Jerome Vouillon wrote:
>On Sun, Jun 09, 2002 at 05:18:31PM -0400, Dan Sugalski wrote:
>>  Who says we're only using callcc to capture continuations? We can do
>>  it anywhere, so we potentially need the registers stored so we can
>>  properly restore state when we're invoked.
>
>I don't understand what you mean.  In Scheme, callcc captures the
>current continuation and passes it to a function.  If our callcc does
>not capture the current continuation, what does it do?

callcc will call a sub and pass in the current continuation, yes. 
However, we're not limiting the continuation capture point to spots 
where we callcc. We can capture a continuation anywhere, hence the 
need to capture the registers at the time we capture the 
continuation, since we won't be at a point where the register 
contents are declared volatile.
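
To make that concrete, here is a rough sketch of capturing at an arbitrary point. The Interp and Continuation structs, NUM_REGS, and both function names are assumptions made for illustration; the real interpreter has several register files and stacks, and this shows only one integer file plus a program counter.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define NUM_REGS 32   /* assumed register count for this sketch */

/* Hypothetical, cut-down interpreter state. */
typedef struct {
    long   int_reg[NUM_REGS];
    size_t pc;
} Interp;

/* A continuation here is simply a saved copy of that state. */
typedef struct {
    long   int_reg[NUM_REGS];
    size_t pc;
} Continuation;

/* Capture anywhere: the registers are copied because, away from a
 * call boundary, nothing says their contents may be thrown away.
 * Returns NULL if allocation fails. */
Continuation *continuation_capture(const Interp *in)
{
    assert(in != NULL);
    Continuation *k = malloc(sizeof *k);
    if (k == NULL)
        return NULL;
    memcpy(k->int_reg, in->int_reg, sizeof k->int_reg);
    k->pc = in->pc;
    return k;
}

/* Invoke: put the saved registers and resume point back. The
 * continuation is left untouched, so it can be invoked again. */
void continuation_invoke(Interp *in, const Continuation *k)
{
    assert(in != NULL && k != NULL);
    memcpy(in->int_reg, k->int_reg, sizeof in->int_reg);
    in->pc = k->pc;
}
```

At a callcc site the caller could have been told to treat its registers as volatile, which would make the copy unnecessary; it is the capture-anywhere case that forces the snapshot.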
-- 
 Dan




[PATCH] packfile reading

2002-06-10 Thread Jason Gloudon


This fixes the problem with reading .pbc files on win32. Someone may want to
write the code to do something useful with the results of stat() when mmap() is
not being used.


Index: assemble.pl
===
RCS file: /cvs/public/parrot/assemble.pl,v
retrieving revision 1.66
diff -u -r1.66 assemble.pl
--- assemble.pl 10 Jun 2002 05:40:06 -  1.66
+++ assemble.pl 10 Jun 2002 23:24:45 -
@@ -813,6 +813,7 @@
   close FILE;
 }
 else {
+  binmode STDOUT;
   print $bytecode;
 }
 
Index: embed.c
===
RCS file: /cvs/public/parrot/embed.c,v
retrieving revision 1.26
diff -u -r1.26 embed.c
--- embed.c 8 Jun 2002 03:38:45 -   1.26
+++ embed.c 10 Jun 2002 23:24:45 -
@@ -110,6 +110,7 @@
 INTVAL read_result;
 
 program_code = (char *)malloc(program_size + 1024);
+program_size = 0;
 if (NULL == program_code) {
 fprintf(stderr,
 "Parrot VM: Could not allocate buffer to read packfile from PIO.\n");
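
On the stat() remark at the top of this message: one possible shape for the non-mmap() path is sketched below. It is not taken from embed.c; read_packfile is a hypothetical helper, and it uses stdio rather than Parrot's PIO layer. The size reported by stat() sizes the buffer, and the file is opened in binary mode, which is the read-side counterpart of the binmode fix above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

/* Read a whole file into a malloc'd buffer sized from stat().
 * "rb" matters on Win32, where text mode translates line endings and
 * would corrupt bytecode. On success, stores the byte count in
 * *size_out and returns the buffer; returns NULL on any failure. */
unsigned char *read_packfile(const char *path, size_t *size_out)
{
    assert(path != NULL && size_out != NULL);

    struct stat st;
    if (stat(path, &st) != 0 || st.st_size < 0)
        return NULL;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    size_t want = (size_t)st.st_size;
    unsigned char *buf = malloc(want ? want : 1);  /* avoid malloc(0) */
    if (buf == NULL) {
        fclose(fp);
        return NULL;
    }

    size_t got = fread(buf, 1, want, fp);
    fclose(fp);
    if (got != want) {   /* short read: treat as an error */
        free(buf);
        return NULL;
    }

    *size_out = got;
    return buf;
}
```

The caller owns the returned buffer and releases it with free().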



Consensus needed...

2002-06-10 Thread Jeff

Tests are now failing because of the removal of the 'inc_n_ic' opcode. I
find this interesting for several reasons. One, the tests probably
should have been removed. Two, once the 'inc' operator has two
parameters, it is no longer 'increment' in my mind. I would call
two-parameter 'inc' two-parameter 'add', as it's no longer the rough
equivalent of '$i++', but '$i+=5' or some such operation.

If anyone would like 'inc_i_ic' and the like to still be called 'inc_',
speak within the next few days or hold your peace until someone else
decides to add them back to CVS. I'll rewrite the tests to 'add_n_ic'
and that ilk.

Opinions? Comments? Concerns?
--
Jeff <[EMAIL PROTECTED]>



Re: [PATCH] packfile reading

2002-06-10 Thread Josh Wilmes

At 19:33 on 06/10/2002 EDT, Jason Gloudon <[EMAIL PROTECTED]> wrote:

> Someone may want to write the code to do something useful with the results
> of stat() when mmap() is not being used.

It's supposed to already do that... did i goof?

--Josh




Re: Consensus needed...

2002-06-10 Thread Dan Sugalski

At 8:17 PM -0400 6/10/02, Jeff wrote:
>If anyone would like 'inc_i_ic' and the like to still be called 'inc_',
>speak within the next few days or hold your peace until someone else
>decides to add them back to CVS. I'll rewrite the tests to 'add_n_ic'
>and that ilk.

Too bad, they lose. :) add is what we'll call it.
-- 
 Dan




For August

2002-06-10 Thread Dan Sugalski

Here's the list 'o stuff I'd like to get done for August:

*) Multiple interpreters with inter-interpreter calling done right
*) Threads with multiple independent interpreters
*) Method calls
*) PMC attributes
-- 
 Dan




Re: For August

2002-06-10 Thread Robert Spier


I've updated http://www.parrotcode.org/todo with the latest info from
Dan.


Dan Sugalski writes:
>Here's the list 'o stuff I'd like to get done for August:
>
>*) Multiple interpreters with inter-interpreter calling done right
>*) Threads with multiple independent interpreters
>*) Method calls
>*) PMC attributes
>-- 
> Dan
>

-- 




Stacks, stacks, stacks (And frames)

2002-06-10 Thread Dan Sugalski

(A note--when this says "stack" I really mean all the stacks)

Okay, I've been thinking about stacks and stack frames, and suchlike 
things. Well, calling them "stacks" is a bit of a misnomer, since 
they're really trees, and that's partially where things get nasty. 
Looking at them as trees does make some things clearer.

First, the assumptions:

1) Most parrot code will be machine generated
2) We may have continuations taken and called most any time
3) Subs we call might really be coroutines
4) We want to be fast

So, then, the support. First we can presize the stack frame. We know, 
for most code, how many stack entries will be needed, on the generic 
stack, the register stacks, and the integer stack. Yes, for some code 
we can't tell, but for most we can. Adding a "newframe" op with a 
size will do it for us. newframe should also close off the current 
stack frame. (Which is to say, when we 'newframe' we stop using the 
current frame and set its 'free' count to 0, even if there are still 
some free slots)

Second, we may potentially have to save the contents of the stacks, 
since we might need to reinstate them. The closing properties of 
newframe should help here--we just make sure that we have a closed 
off stack at the point we take a continuation.

The third gets easier if we guarantee that we call a routine, any 
routine, with a closed stack. Since we're potentially passing 
parameters on the stack, this is somewhat problematic. I'm a little 
dodgy here, but I'm thinking that the topmost stack frame on call 
into a function becomes the property of that subroutine, which works 
except in those cases where there are too many entries for a single 
frame, at which point things get a bit odd.

Finally, fast. Right now the pushes and pops are all cautious, 
checking for space, autoextending, and suchlike things. We can add a 
set of quickpush and quickpop opcodes that don't bother checking in 
those cases where we know there's space. (For example, if we extend 
the stack by 10, the next 10 pushes don't need to check depth, and 
when we preextend the generic stack we can fill in what's needed on 
extend time to minimize what needs to go on the stack)
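
A rough sketch of how newframe, quickpush and quickpop might fit together, assuming a single stack of long values. The Frame layout, the default size of 8, and the function signatures are all made up for illustration; only the op names and the "set free to 0 to close a frame" rule come from the description above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical frame layout; the field names are illustrative. */
typedef struct Frame {
    struct Frame *prev;  /* parent frame (several frames may share one) */
    size_t used;         /* slots currently holding values */
    size_t free;         /* slots still available; 0 once closed */
    long   slot[];       /* storage, sized when the frame is created */
} Frame;

/* newframe: close the current frame and open a presized one above it.
 * Returns NULL (and leaves cur open) if allocation fails. */
Frame *newframe(Frame *cur, size_t size)
{
    Frame *f = malloc(sizeof *f + size * sizeof f->slot[0]);
    if (f == NULL)
        return NULL;
    if (cur != NULL)
        cur->free = 0;   /* closed, even if slots were left over */
    f->prev = cur;
    f->used = 0;
    f->free = size;
    return f;
}

/* quickpush/quickpop: for the open, topmost frame when the code
 * generator already knows there is room. The asserts are debug aids
 * only and compile away under NDEBUG. */
void quickpush(Frame *f, long v)
{
    assert(f->free > 0);
    f->slot[f->used++] = v;
    f->free--;
}

long quickpop(Frame *f)
{
    assert(f->used > 0);
    f->free++;
    return f->slot[--f->used];
}

/* Cautious push: checks for space and opens a new frame when needed.
 * Returns the frame that is now topmost, or NULL on failure. */
Frame *push(Frame *cur, long v)
{
    if (cur == NULL || cur->free == 0) {
        cur = newframe(cur, 8);   /* 8 is an arbitrary default */
        if (cur == NULL)
            return NULL;
    }
    quickpush(cur, v);
    return cur;
}
```

After newframe(cur, 10), the next ten pushes can be emitted as quickpush with no depth check, which is the saving described above. Popping back into a closed frame is left out of the sketch.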
-- 
 Dan
