Re: Apoc 5 questions/comments
Damian Conway <[EMAIL PROTECTED]> outlined his plans for world domination: [...] > > Dammit, you fools! Do I have to think of *everything*??? Just tie him to a > steel bench and apply the Ruby laser! > > I do apologize, Mr Wardley. Good evil assistants are just impossible to get > these days. > Cut the Smalltalk. It's off-topic. -- Ariel Scolnicov
Re: [COMMIT] Subs and co-routines in Parrot
At 12:53 PM +0200 6/9/02, Jerome Vouillon wrote:
>On Sat, Jun 08, 2002 at 03:54:06PM -0400, Melvin Smith wrote:
>The Java bytecode interpreter is clearly not optimized for speed.
>David Gregg, Anton Ertl and Andreas Krall have experimented with an
>improved Java bytecode interpreter. One of the optimisations they
>perform is to get rid of this copying. Overall, they get a factor 5
>to 10 improvement on some JavaSPEC benchmarks over other Java bytecode
>interpreters.
>(See "A fast Java interpreter" and "Implementing an efficient Java
> interpreter" from http://www.complang.tuwien.ac.at/papers/
> By the way, Anton Ertl has written a lot of other very good papers on
> bytecode interpreters.)

Right, the Java folks always planned for a JIT, or at least a rewrite, and didn't spend that much time optimizing the initial interpreter. Which, honestly, we're doing too.

>> >Yeah, that's too much work for me. I'd rather do something simpler, even
>> >if that boils down to "we return a single ParrotList with all the return
>> >values in it, stuck in P0".
>
>Yes, that would be fine. We can still optimize this later if
>necessary. The paper "An Efficient Implementation of Multiple Return
>Values in Scheme" by Ashley and Dybvig presents a possible
>implementation.
>(http://citeseer.nj.nec.com/ashley94efficient.html)

Yeek. On initial read, that just screams out "Huge Hack". That might just be the Scheme bits, though. :) Some interesting ideas. I'll have to think about it some.

-- 
Dan

--"it's like this"---
Dan Sugalski                          even samurai
[EMAIL PROTECTED]                     have teddy bears and even
                                      teddy bears get drunk
Re: Stack
At 11:09 AM -0400 6/6/02, Jason Gloudon wrote: > >This seems like a good time to send in this patch: > >It allocates the stack content memory using a buffer. This makes the stack >chunks and the memory used to hold stack contents visible to the garbage >collector. One can incrementally add to this to support copy-on-write >semantics for the chunk contents, which I understand is going to be useful in >taking continuations. This could be done by making the stack chunks themselves >buffer headers or perhaps PMCs, which would be cool for introspection. Applied, thanks. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
Re: Subs for parrot
On Sun, Jun 09, 2002 at 05:18:31PM -0400, Dan Sugalski wrote:
> Who says we're only using callcc to capture continuations? We can do
> it anywhere, so we potentially need the registers stored so we can
> properly restore state when we're invoked.

I don't understand what you mean. In Scheme, callcc captures the current continuation and passes it to a function. If our callcc does not capture the current continuation, what does it do?

-- Jerome
Re: Apoc 5 questions/comments
On Sun, Jun 09, 2002 at 03:34:16PM +1000, Damian Conway wrote:
> Trey Harris wrote:
> > rule val {
> >     [ # quoted
> >         $b := <['"]>
> >         ( [ \\. | . ]*? )
> >         $b
> >     ] | # or not
> >     (\H+)
> > }
>
> Not quite. Assigning to $b is a capture.

I'm confused. The examples in A5 all show $var := (pattern). So are you saying that parens or no, binding with := effects a capture into $1,$2,etc.? Or that it effects a capture that alters the return value of the rule, just not $1,$2,etc.?

A5 says:

    Subrules called via <...> also capture their result in hypothetical
    variables. It's possible to name the results of any <...>, but
    grammar rules already have a name by default, so you don't have to
    give them names unless you call the same rule more than once. So,
    presuming you have grammar rules defining "key" and "value", you
    can say:

        / <key> \: <value> { let %hash{$key} = $value } /

So ... should this work?

    rule quote { <["']> }
    rule quotedword { <quote> (<alpha>+) $quote }
    $justtheword = /<quotedword>/;

And if the above works, why can't "$var:=atom" be a short hand for a lexical "rule var { atom }" that only applies for the current ... um ... rule/? And thus the capture would be out of band WRT $1, $2, etc. or the rule's return value.

-Scott
-- 
Jonathan Scott Duff [EMAIL PROTECTED]
Re: Apoc 5 questions/comments
I assume that 'fatal.pm' is a new pragma. 1) What (if anything) does it do, aside from turning 'fail' into a fatal exception when used outside a regex? 2) Do you need to use it before you can (usefully) use 'fail' INSIDE a regex? (I would assume not, but thought I'd check.) Dave On Fri, 7 Jun 2002, Larry Wall wrote: > On Fri, 7 Jun 2002, Dave Storrs wrote: > > Just to be sure I understood: you meant that (A) yes, you can use > > fail in a subroutine outside a regex, and (B) if you do, it is no > > different from die. Is that correct? > > Depends on the caller's use of "use fatal". If they don't use fatal, > it returns undef. > > Larry > >
Re: Apoc 5 questions/comments
On Mon, 10 Jun 2002, Dave Storrs wrote: > > I assume that 'fatal.pm' is a new pragma. Already exists for Perl 5, actually. > 1) What (if anything) does it do, aside from turning 'fail' into a fatal > exception when used outside a regex? What fatal currently does is wrap built-ins that might return undef with code that will die when undef is returned. I'm just generalizing that to having a keyword that fails in whatever way the calling context desires, whether by returning undef, throwing an exception, or backtracking the current regex. > 2) Do you need to use it before you can (usefully) use 'fail' INSIDE a > regex? (I would assume not, but thought I'd check.) No, it'll be built-in. You'll only need to invoke the pragma to change the defaults. Larry
Re: Apoc 5 questions/comments
On Mon, 10 Jun 2002, Larry Wall wrote: > On Mon, 10 Jun 2002, Dave Storrs wrote: > > > > > I assume that 'fatal.pm' is a new pragma. > > Already exists for Perl 5, actually. *blush* Must have missed it. Drat, and I just finished rereading Camel III. Apologies. Dave
Re: Apoc 5 questions/comments
On Fri, 7 Jun 2002, Luke Palmer wrote:
> > Dave Storrs wrote:
> > Can we please have a 'reverse x' modifier that means "treat whitespace as
> > literals"? Yes, we are living in a Unicode world now and your data could
> >
> > /FATAL ERROR\:    Process (\d+) received signal\: (\d+)/
>
> I don't see how this example is nearly as flexible as this:
>
> m:w/FATAL ERROR\: Process (\d+) received signal\: (\d+)/
>
> Yours will only match 4 spaces after FATAL ERROR:, whereas mine will match
> any number. [...]
> I see the :w modifier as a good flexibility enforcement. It will
> keep people away from matching things that very literally.

Respectfully, Luke, I think you and I are discussing separate issues. You are talking about the best way to match multiple whitespace--and I agree with what you're saying, one should never assume that it will always be 4 spaces instead of 5. Were I writing that code in a real project, instead of as a demo for the list, I would use \s+ (in P5, anyway...in P6, whether I would use \s+ or \h+ would depend on circumstances).

However, the point I was making was that, if I feel confident in only handling a limited subset of the possibilities because I know what I'm going to be getting (because, e.g., I wrote it out myself), then I would like a way to do away with the visual clutter involved in backwhacking or entity-izing every bit of whitespace. Perl has never been a nanny-language...one of its greatest strengths has always been that it trusts me to make my own decisions and, if I want to shoot myself in the foot, I can. :>

The suggestions that other people have been making about defining subrules and then building them up in order to make the entire match are good, and in general that's a very powerful technique. However, the lines devoted to those subrules still count as visual clutter, and I'd still like a way to do away with them.

Dave
Re: Apoc 5 questions/comments
Jonathan Scott Duff wrote:
> > > rule val {
> > >     [ # quoted
> > >         $b := <['"]>
> > >         ( [ \\. | . ]*? )
> > >         $b
> > >     ] | # or not
> > >     (\H+)
> > > }
> >
> > Not quite. Assigning to $b is a capture.
>
> I'm confused. The examples in A5 all show $var := (pattern). So are you
> saying that parens or no, binding with := effects a capture into
> $1,$2,etc.? Or that it effects a capture that alters the return value
> of the rule, just not $1,$2,etc.?

The latter.

> So ... should this work?
>
> rule quote { <["']> }
> rule quotedword { <quote> (<alpha>+) $quote }
> $justtheword = /<quotedword>/;

My understanding is that it won't just return the word. If you invoke a named rule, its return value is captured in a hypothetical variable of the same name (but *not* into a numbered hypovar -- only parens do that). The named hypovar lives inside the object that is ultimately returned to the next level up. So C<quote> returns (what appears to be) a simple string to C<quotedword>, but -- because of the captures it does -- C<quotedword> returns an object with embedded C<$quote> and C<$alpha> hypovars.

> And if the above works, why can't "$var:=atom" be a short hand for a
> lexical "rule var { atom }" that only applies for the current ... um ...
> rule/? And thus the capture would be out
> of band WRT $1, $2, etc. or the rule's return value.

As explained above, named captures *are* out-of-band wrt $1, $2, etc. Just not wrt the return value.

As I mentioned in a previous post, the issue is how to control what a given (sub-)rule returns (i.e. all its explicit and captures, or just a specific result). I think the correct answer is to control that explicitly, via a assertion or a $RETURN:= capture.

Damian
Re: Apoc 5 questions/comments
On Sun, 9 Jun 2002 [EMAIL PROTECTED] wrote:
: The parsing of perl 6 is the application of a huge, compiled, regex, correct?

No, it's a system of compiled regexes which we're calling a grammar.

: In order to parse the new syntax, perl6 is going to have to compile the
: new rule, and stick it in the place of the old one, for the duration of the
: scope, right?

Doesn't exactly "stick it in place of" except in an abstract sense. It uses ordinary method overriding to hide the old rule.

: Now what happens to the parser at large if you have dependencies on what has
: changed - ex: if you change the rule for brackets, say so that all '[' are now
: actually '[[' and all ']' are now ']]'. Won't the whole regex for parsing
: perl need to be recompiled for the duration of the block, or at least the
: dependencies on the things that you changed? And won't *that* be slow and/or
: memory intensive?

No, only the rule in question is compiled, and that only happens once regardless of how often you invoke the rule. There might possibly be a compilation phase when you derive a new grammar from an old one, but that's a tradeoff we can make when we get to it. It'd still only happen once for a given grammar.

: And if the rules are somehow abstracted in the perl6 parser/parrot/regex engine,
: so that each 'rule' is in essence a pointer to the real code corresponding to
: interpreting that rule (so it can be replaced easily by user defined ones) -
: well won't that abstraction hurt the performance of parsing regular perl?

If that becomes an issue we can always install a hard-wired lexer/parser as the base grammar.

: And finally, if the regular expressions are in bytecode to get this flexibility
: as opposed to native machine code, what sort of overhead will this impose on
: the regex engine?

Er, Perl 5's regexes are in their own peculiar bytecode, not in native machine code. If anything, we'll be better off with Perl 6's JITable bytecode.

: I know the above might be a bit simplistic, and since its an implementation
: question I'm posting to perl6-internals instead, but the post is more for the
: point of clarification about what's going on than anything else. I'd love to
: see this happen, would use it all the time..

That's our dream.

Larry
A5 - A job well done
Larry,

Wow, that was a very good demolition and rebuilding of the regex edifice.

When the RFCs were being written I spent many hours thinking over some of the issues and writing many of the RFCs on regexes, trying to build on what was in perl5, without changing the existing language use. By allowing change to that starting point he has done a much better job of it. (I was not a novice in this as I had done research in pattern matching at University many many years ago)

At the time of the RFCs I was employed and hence had more free time to spend thinking about the design of perl6 than I do at present. (How is it that being unemployed I have LESS free time...)

Richard
-- 
Personal      [EMAIL PROTECTED]  http://www.waveney.org
Telecoms      [EMAIL PROTECTED]  http://www.WaveneyConsulting.com
Web services  [EMAIL PROTECTED]  http://www.wavwebs.com
Independent Telecommunications Consultant, ATM expert, Web Analyst & Services
Re: Subs for parrot
At 11:31 AM +0200 6/10/02, Jerome Vouillon wrote:
>On Sun, Jun 09, 2002 at 05:18:31PM -0400, Dan Sugalski wrote:
>> Who says we're only using callcc to capture continuations? We can do
>> it anywhere, so we potentially need the registers stored so we can
>> properly restore state when we're invoked.
>
>I don't understand what you mean. In Scheme, callcc captures the
>current continuation and passes it to a function. If our callcc does
>not capture the current continuation, what does it do?

callcc will call a sub and pass in the current continuation, yes. However, we're not limiting the continuation capture point to spots where we callcc. We can capture a continuation anywhere, hence the need to capture the registers at the time we capture the continuation, since we won't be at a point where the register contents are declared volatile.

-- 
Dan
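
To make the point about registers concrete, here is a minimal sketch in C of a continuation that snapshots the register file at capture time. Everything in it is an assumption for illustration: the `Interp` and `Continuation` structs, the `capture` and `invoke` helpers, and the register count are hypothetical and are not Parrot's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NUM_REGS 32  /* assumed register count, for illustration only */

/* Hypothetical interpreter state: one integer register file and a program counter. */
typedef struct Interp {
    long   int_regs[NUM_REGS];
    size_t pc;
} Interp;

/* Hypothetical continuation: a copy of the state as it was at capture time. */
typedef struct Continuation {
    long   int_regs[NUM_REGS];
    size_t pc;
} Continuation;

/* Capture may happen at any point, so the registers cannot be assumed dead;
 * copy them all rather than relying on a calling convention. */
static Continuation capture(const Interp *in) {
    Continuation k;
    memcpy(k.int_regs, in->int_regs, sizeof k.int_regs);
    k.pc = in->pc;
    return k;
}

/* Invoking the continuation restores the registers and resumes at the saved pc. */
static void invoke(Interp *in, const Continuation *k) {
    memcpy(in->int_regs, k->int_regs, sizeof in->int_regs);
    in->pc = k->pc;
}
```

If capture were only possible at a callcc call site, the calling convention could declare the registers volatile there and the copy could be skipped; the copy is the price of allowing capture anywhere.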
[PATCH] packfile reading
This fixes the problem with reading .pbc files on win32.

Someone may want to write the code to do something useful with the results of stat() when mmap() is not being used.

Index: assemble.pl
===
RCS file: /cvs/public/parrot/assemble.pl,v
retrieving revision 1.66
diff -u -r1.66 assemble.pl
--- assemble.pl 10 Jun 2002 05:40:06 - 1.66
+++ assemble.pl 10 Jun 2002 23:24:45 -
@@ -813,6 +813,7 @@
     close FILE;
 }
 else {
+    binmode STDOUT;
     print $bytecode;
 }
Index: embed.c
===
RCS file: /cvs/public/parrot/embed.c,v
retrieving revision 1.26
diff -u -r1.26 embed.c
--- embed.c 8 Jun 2002 03:38:45 - 1.26
+++ embed.c 10 Jun 2002 23:24:45 -
@@ -110,6 +110,7 @@
 INTVAL read_result;

 program_code = (char *)malloc(program_size + 1024);
+program_size = 0;
 if (NULL == program_code) {
     fprintf(stderr, "Parrot VM: Could not allocate buffer to read packfile from PIO.\n");
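
For readers unfamiliar with the non-mmap() path, the following is a rough sketch in C of reading a whole file into a growing buffer while keeping a running byte count. It is not the code in embed.c: `read_all` is a hypothetical helper, and it uses plain stdio rather than Parrot's PIO layer.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read everything from `in` into a malloc'd buffer.
 * On success returns the buffer and stores the byte count in *size_out;
 * returns NULL on allocation or read failure. The caller frees the buffer. */
static char *read_all(FILE *in, size_t *size_out) {
    size_t capacity = 1024;
    size_t size = 0;              /* running count of bytes read so far */
    char *buf = malloc(capacity);
    if (buf == NULL) {
        return NULL;
    }
    for (;;) {
        size_t n = fread(buf + size, 1, capacity - size, in);
        size += n;
        if (n == 0) {
            break;                /* end of file, or an error checked below */
        }
        if (size == capacity) {   /* buffer full: double it before reading more */
            char *bigger = realloc(buf, capacity * 2);
            if (bigger == NULL) {
                free(buf);
                return NULL;
            }
            buf = bigger;
            capacity *= 2;
        }
    }
    if (ferror(in)) {
        free(buf);
        return NULL;
    }
    *size_out = size;
    return buf;
}
```

On win32 the stream would need to be opened in binary mode (for example with fopen(path, "rb")), for the same reason the patch adds binmode STDOUT on the writing side: text mode translates line endings and corrupts bytecode.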
Consensus needed...
Tests are now failing because of the removal of the 'inc_n_ic' opcode. I find this interesting for several reasons. One, the tests probably should have been removed. Two, once the 'inc' operator has two parameters, it is no longer 'increment' in my mind. I would call two-parameter 'inc' two-parameter 'add', as it's no longer the rough equivalent of '$i++', but '$i+=5' or some such operation. If anyone would like 'inc_i_ic' and the like to still be called 'inc_', speak within the next few days or hold your peace until someone else decides to add them back to CVS. I'll rewrite the tests to 'add_n_ic' and that ilk. Opinions? Comments? Concerns? -- Jeff <[EMAIL PROTECTED]>
Re: [PATCH] packfile reading
At 19:33 on 06/10/2002 EDT, Jason Gloudon <[EMAIL PROTECTED]> wrote: > Someone may want to write the code to do something useful with the results > of stat() when mmap() is not being used. It's supposed to already do that... did i goof? --Josh
Re: Consensus needed...
At 8:17 PM -0400 6/10/02, Jeff wrote: >If anyone would like 'inc_i_ic' and the like to still be called 'inc_', >speak within the next few days or hold your peace until someone else >decides to add them back to CVS. I'll rewrite the tests to 'add_n_ic' >and that ilk. Too bad, they lose. :) add is what we'll call it. -- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
For August
Here's the list 'o stuff I'd like to get done for August:

*) Multiple interpreters with inter-interpreter calling done right
*) Threads with multiple independent interpreters
*) Method calls
*) PMC attributes

-- 
Dan
Re: For August
I've updated http://www.parrotcode.org/todo with the latest info from Dan.

Dan Sugalski writes:
>Here's the list 'o stuff I'd like to get done for August:
>
>*) Multiple interpreters with inter-interpreter calling done right
>*) Threads with multiple independent interpreters
>*) Method calls
>*) PMC attributes
Stacks, stacks, stacks (And frames)
(A note--when this says "stack" I really mean all the stacks)

Okay, I've been thinking about stacks and stack frames, and suchlike things. Well, calling them "stacks" is a bit of a misnomer, since they're really trees, and that's partially where things get nasty. Looking at them as trees does make some things clearer.

First, the assumptions:

1) Most parrot code will be machine generated
2) We may have continuations taken and called most any time
3) Subs we call might really be coroutines
4) We want to be fast

So, then, the support.

First we can presize the stack frame. We know, for most code, how many stack entries will be needed, on the generic stack, the register stacks, and the integer stack. Yes, for some code we can't tell, but for most we can. Adding a "newframe" op with a size will do it for us. newframe should also close off the current stack frame. (Which is to say, when we 'newframe' we stop using the current frame and set its 'free' count to 0, even if there are still some free slots)

Second, we may potentially have to save the contents of the stacks, since we might need to reinstate them. The closing properties of newframe should help here--we just make sure that we have a closed off stack at the point we take a continuation.

The third makes life easier if we guarantee that we call a routine, any routine, with a closed stack. Since we're potentially passing parameters on the stack, this is somewhat problematic. I'm a little dodgy here, but I'm thinking that the topmost stack frame on call into a function becomes the property of that subroutine, which works except in those cases where there are too many entries for a single frame, at which point things get a bit odd.

Finally, fast. Right now the pushes and pops are all cautious, checking for space, autoextending, and suchlike things. We can add a set of quickpush and quickpop opcodes that don't bother checking in those cases where we know there's space. (For example, if we extend the stack by 10, the next 10 pushes don't need to check depth, and when we preextend the generic stack we can fill in what's needed at extend time to minimize what needs to go on the stack)

-- 
Dan
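
The newframe and quickpush/quickpop ideas above can be sketched roughly as follows. This is an illustration under stated assumptions, not Parrot code: the `Frame` layout, the single `long` slot type, the fallback frame size of 8, and the C-function form of the ops are all hypothetical, and ownership is simplified to a single chain even though the message notes the real structure is a tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical frame layout; each frame points at the frame it was opened on. */
typedef struct Frame {
    struct Frame *prev;
    size_t used;        /* slots currently occupied */
    size_t free_slots;  /* slots still available; 0 means the frame is closed */
    long  *slots;
} Frame;

/* "newframe": close the current frame (free count forced to 0, even if slots
 * remain) and open a presized frame on top of it. Returns NULL on failure. */
static Frame *newframe(Frame *current, size_t size) {
    Frame *f = malloc(sizeof *f);
    if (f == NULL) {
        return NULL;
    }
    f->slots = malloc((size ? size : 1) * sizeof *f->slots);
    if (f->slots == NULL) {
        free(f);
        return NULL;
    }
    if (current != NULL) {
        current->free_slots = 0;
    }
    f->prev = current;
    f->used = 0;
    f->free_slots = size;
    return f;
}

/* Cautious push: checks for space and opens a new frame when there is none.
 * Returns the frame that now holds the top of stack, or NULL on failure. */
static Frame *push(Frame *f, long value) {
    if (f->free_slots == 0) {
        f = newframe(f, 8);  /* 8 is an arbitrary fallback size */
        if (f == NULL) {
            return NULL;
        }
    }
    f->slots[f->used++] = value;
    f->free_slots--;
    return f;
}

/* quickpush/quickpop: no checks; only valid when the code generator already
 * knows the frame has room (or has an entry to pop). */
static void quickpush(Frame *f, long value) {
    f->slots[f->used++] = value;
    f->free_slots--;
}

static long quickpop(Frame *f) {
    f->free_slots++;
    return f->slots[--f->used];
}

/* Free a chain of frames from the top down (simplified ownership). */
static void free_frames(Frame *f) {
    while (f != NULL) {
        Frame *prev = f->prev;
        free(f->slots);
        free(f);
        f = prev;
    }
}
```

Machine-generated code that knows a sub needs, say, ten slots would emit one newframe with that size followed by unchecked quickpush calls, leaving the cautious push for code whose depth cannot be determined ahead of time.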