[RELEASE] Parrot 0.0.5 is out of its cage.
It was the dawning of the second age of parrotkind, ten weeks after the great GC war. The Parrot Project was a dream given form. Its goal: To prevent language wars by creating an interpreter where perl and other languages could reside peacefully... It can be a dangerous place, but it's our last best hope for peace. This is the story of the latest of the Parrot releases. The year is 2002. The name of the tarfile is Parrot 0.0.5.

[Apologies to J. Michael Straczynski]

The Parrot Team is pleased to announce the release of Parrot 0.0.5, soon to be available on your local CPAN mirror as:

    CPAN/authors/id/J/JG/JGOFF/parrot-0.0.5.tar.gz

From the NEWS file:

New in 0.0.5
- Full GC
- Perl Scalar support in PMCs
- Array and Hash types almost ready for prime-time
- Internal support for keyed types
- EMACS editing mode
- New PDDs
- New Language - BASIC
- Regular expression compiler
- More tests
- Many, many bug fixes, enhancements, and speedups

I'd personally like to thank everyone on the Parrot development team for contributing to what's turned out to be a great release. Parrot is finally starting to look like a stable platform for development, especially with the attention paid to GC and string issues. The BASIC language gives people more toys to play with, and Perl scalars help prove that we can indeed handle Perl6 when the design process is finished.

Over the next few days, expect an updated roadmap as to where we see Parrot going. Complete support for keyed PMC types will be one of the first items to be checked off, followed shortly by regular expressions and symbol tables.

If you want to join in on the fun, start by downloading a copy of parrot-0.0.5 at CPAN at one of the following URLs (or a mirror):

    http://www.cpan.org/authors/id/J/JG/JGOFF/parrot-0.0.5.tar.gz
    http://www.cpan.org/src/parrot-0.0.5.tar.gz

To get the latest CVS version:

    http://cvs.perl.org/

has the information you need.

Once you've unpacked Parrot, build it as follows:

    perl Configure.pl
    make
    make test

After you've done that, look at docs/parrot.pod to learn more about it.

Discussion of Parrot takes place on the perl6-internals mailing list, and patches should be sent there as well. If you're not subscribed, look at:

    http://lists.perl.org/showlist.cgi?name=perl6-internals

for tips on how to subscribe. CVS commit access is given out to developers who consistently submit good patches to the mailing list.

Have fun, and hack well.
-- Jeff <[EMAIL PROTECTED]>
[RELEASE] Parrot 0.0.6 Leaves The Village
"Where am I?"
"In the CPAN."
"What do you want?"
"Keyed access."
"Whose side are you on?"
"That...would be telling. We want...keyed access."
"You won't get it."
"By hack or by crack... We will."
"Who are you?"
"The new pumpking."
"Who is number 2?"
"You are Version Six."
"I am not a version, I am the final release!"

[apologies to Patrick McGoohan]

Welcome to version 0.0.6 of Parrot. Major changes in this release include a new assembler supporting the keyed access syntax, new macro syntax, new Configure scripts, a Parrot assembler written in Parrot, the C#-like language 'cola' with limited OOP support, and lots of new documentation. Some contributions include tetris.pasm and an implementation of LZW compression.

As per usual, if you want to join in on the fun, start by downloading a copy of parrot-0.0.6 at CPAN at one of the following URLs (or a mirror):

    http://www.cpan.org/authors/id/J/JG/JGOFF/parrot_0.0.6.tgz
    http://www.cpan.org/src/parrot_0.0.6.tgz

To get the latest CVS version:

    http://cvs.perl.org/

has the information you need.

Once you've unpacked Parrot, build it as follows:

    perl Configure.pl
    make
    make test

After you've done that, look at docs/parrot.pod to learn more about it.

Discussion of Parrot takes place on the perl6-internals mailing list, and patches should be sent there as well. If you're not subscribed, look at:

    http://lists.perl.org/showlist.cgi?name=perl6-internals

for tips on how to subscribe. CVS commit access is given out to developers who consistently submit good patches to the mailing list.

Be Seeing You.
-- Jeff <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
[PRE-RELEASE] Release of 0.0.7 tomorrow evening
As the message says. Code freeze tonight at midnight EDT (GMT-0400). I'll be tagging with PRE_REL_0.0.7 then.

Features to be included:
- Perl 6 grammar
- Partial perl6 compiler
- Pure-perl assembler
- Heavily patched and upgraded intermediate language
- Massive patching in general, cleaned-up PMCs.
- FORTH :)

If anyone can give me a good reason why we shouldn't release (short of show-stopping bugs &c) then speak before 2000GMT or until tomorrow hold h(er|is) peace.
-- Jeff <[EMAIL PROTECTED]>
[RELEASE] Parrot 007: Secret Agent Bird
"There's a bird who leads a life of hacking
From everyone he meets / He gets some backing
With every patch he takes / Another build we make
Odds are he won't be the same tomorrow
Secret Agent bird / Secret agent bird
They've built the perl 6 grammar / And taken away a kluge"

"Do you expect me to squawk()?"
"No, Mr. Parrot. I expect you to die()."

Apologies to Johnny Rivers and Ian Fleming.

Welcome to version 0.0.7 of Parrot. The big news: Perl 6 grammar and a small but functional compiler! Check out languages/perl6 and the tests, but make sure to build languages/imcc first.

- Functional subroutine, coroutine, and continuation PMCs
- Global variables
- Intermediate bytecode compiler (languages/imcc)
- Assembler now entirely in perl, no more PakFile2.xs
- Working GC

As per usual, if you want to join in on the fun, start by downloading a copy of parrot-0.0.7 at CPAN at one of the following URLs (or a mirror):

    http://www.cpan.org/authors/id/J/JG/JGOFF/parrot-0.0.7.tgz
    http://www.cpan.org/src/parrot-0.0.7.tgz

To get the latest CVS version:

    http://cvs.perl.org/

has the information you need.

Once you've unpacked Parrot, build it as follows:

    perl Configure.pl
    make
    make test

After you've done that, look at docs/parrot.pod to learn more about it.

Discussion of Parrot takes place on the perl6-internals mailing list, and patches should be sent there as well. If you're not subscribed, look at:

    http://lists.perl.org/showlist.cgi?name=perl6-internals

for tips on how to subscribe. CVS commit access is given out to developers who consistently submit good patches to the mailing list.

"The name is Parrot. Percy Parrot."
-- Jeff <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
[RELEASE] Parrot 0.0.8 Codename: Pieces of Eight
Ooo I need your code, babe
Guess you know it's true
Hope you need this build babe
Just like I need you
-- Apologies to John Lennon

(alternate codename: Octarine)

News collected from Piers Cawley's excellent summaries:

- Working Perl6 REs
- Multidimensional keyed access
- JIT for the ARM
- Lexical scope operators
- And many bug fixes and smaller subsystems.

Please note that due to the new (soon to be utilized) Unicode data files from the ICU project, Parrot has upgraded from a parrotlet to an Amazon. Please be gentle when utilizing the CVS server (compression is more important now, '-z3' from the command line).

As per usual, if you want to join in on the fun, start by downloading a copy of parrot-0.0.8 at CPAN at one of the following URLs (or a mirror):

    http://www.cpan.org/authors/id/J/JG/JGOFF/parrot-0.0.8.tgz
    http://www.cpan.org/src/parrot-0.0.8.tgz

To get the latest CVS version:

    http://cvs.perl.org/

has the information you need.

Once you've unpacked Parrot, build it as follows:

    perl Configure.pl
    make
    make test

After you've done that, look at docs/parrot.pod to learn more about it.

Discussion of Parrot takes place on the perl6-internals mailing list, and patches should be sent there as well. If you're not subscribed, look at:

    http://lists.perl.org/showlist.cgi?name=perl6-internals

for tips on how to subscribe. CVS commit access is given out to developers who consistently submit good patches to the mailing list.

"The name is Parrot. Percy Parrot."
-- Jeff <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>
[POST-RELEASE] parrot-0.0.8.1.tgz fixes a slight MANIFEST bug...
The 'DEVELOPING' file accidentally made its way into the MANIFEST, but doesn't actually exist in the tarball. It's not a problem, as you can delete the appropriate line in the MANIFEST and continue, but given the large file size I thought I should alert you. 0.0.8.1 is being uploaded at the moment that doesn't have this minor problem. -- Jeff <[EMAIL PROTECTED]>
Re: [FWP] sorting text in human-order
> On Fri, Jan 05, 2001 at 09:42:12PM -0500, Brian Finney wrote: > > say we start with this number > > 123,456,789 > > > > one hundred twenty-three million four hundred fifty-six thousand seven hundred > > eighty-nine > > satakaksikymmentäkolme miljoonaa neljäsataaviisikymmentäkuusi tuhatta > seitsemänsataakahdeksankymmentäyhdeksän. Or 1,2345,6789 ichi oku ni-sen sambyaku yon-ju go man roku sen nana hyaku hachi-ju kyu. or one one-hundred-million, two thousand three hundred forty-five ten-thousand, six thousand seven hundred eighty nine. Why are we trying to teach a computer language about natural languages? Jeff
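The two spellings of the number above differ mainly in where the digits are grouped: by thousands in English, by ten-thousands (man, oku) in Japanese. As a rough sketch only, and with `group_digits` being a made-up helper name rather than anything from the thread, the regrouping could look like this in Python:

```python
def group_digits(n: int, size: int, sep: str = ",") -> str:
    """Group the decimal digits of a non-negative integer from the right.

    `size` is the number of digits per group: 3 for thousands-based
    naming, 4 for the ten-thousand-based naming quoted above.
    """
    digits = str(n)
    groups = []
    while digits:
        groups.append(digits[-size:])   # peel off the rightmost group
        digits = digits[:-size]         # keep whatever is left of it
    return sep.join(reversed(groups))


print(group_digits(123456789, 3))  # 123,456,789
print(group_digits(123456789, 4))  # 1,2345,6789
```

This only handles the grouping; turning each group into words is the language-specific part the thread is poking fun at.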
Re: Larry's Apocalypse 1
> The > timescales of corporations like Sun are not the same as those commonly > encountered in the open software arena. Ditto for HP. Jeff
Re: Larry's Apocalypse 1
> > > The > > > timescales of corporations like Sun are not the same as those commonly > > > encountered in the open software arena. > > > > Ditto for HP. > > Which is more extreme (HP9000/L1000, HP-UX 11.00 + March 2001 patches): > > % /usr/contrib/bin/perl -v > > This is perl, version 4.0 > > $RCSfile: perl.c,v $$Revision: 4.0.1.8 $$Date: 1993/02/05 19:39:30 $ > Patch level: 36 "Top men" are working on this problem. Stay tuned. Jeff
Re: durians
> Grocers either stock [durians] outside or frozen. And I believe there are laws in some of the SE Asian countries where they are more common that make getting on public transportation with a durian illegal. Jeff
perl6-language@perl.org
I have to wonder how many other people just edited /usr/share/games/fortune/perl and added: % Humans are not much into strong compile-time typing, and when they are, we call it stereotyping, or racism, or whatever. -- Larry Wall in <[EMAIL PROTECTED]> And now back to your regularly scheduled meaningful discussion already in progress. -- Jeff Stampes [ [EMAIL PROTECTED] ] -- Build and Release Tools The older a man gets, the farther he had to walk to school as a boy.
"use" semantics
i opened RT #61742 because the semantics of the "use" statement in rakudo changed in such a way that i could no longer precompile mod_perl6 modules. "use" statements are now being invoked during PAST generation, which requires any "used" modules to be in the @INC path and error-free. moritz informed me that this was the correct behavior (which is now obvious to me), and i was able to fix mod_perl6, but it raised an interesting point. if a module expects conditions only present at *runtime*, you can never precompile that module, as the compile-time "use" will fail. this isn't a problem for regular command-line scripts, but what about code such as a mod_perl6 handler? the modules it uses assume they're embedded in an apache process, dlfunc'ing a bunch of apache API functions. right now it is impossible to compile such a module to bytecode, which i like to do for startup performance. i realize that "use" needs to load modules early, but i think there needs to be a distinction so such modules aren't executed out of context. maybe there's an obvious way around this, maybe this is a new edge case. thoughts? -jeff
Re: The Block Returns
Speaking to the practical side, I have written code that has to disentangle itself from the failure of a complex startup sequence. I'd love to be able to build a dynamic exit sequence. (In fact, being able to do &block .= { more_stuff(); }; is way up on my list...) I've wanted to do that sort of thing before, but it seems simpler (conceptually and practically) to build up an array of cleanup subs/blocks to execute in sequence, rather than to have a .= for blocks. (Another reason it's handy to keep them separate is in cases in which each needs to return some information--maybe a status which determines whether to proceed, etc.) JEff
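The "array of cleanup subs" idea can be sketched in a few lines. This is an illustration in Python rather than Perl 6, and several details are assumptions not stated in the message: the steps run most-recent-first, each returns a boolean status, and a false status stops the sequence. The names `run_cleanups` and `make_step` are hypothetical.

```python
from typing import Callable, List


def run_cleanups(cleanups: List[Callable[[], bool]]) -> List[bool]:
    """Run registered cleanup steps, newest first (an assumed ordering).

    Each step returns a status; a False status stops the sequence,
    which is one way a step could "determine whether to proceed".
    """
    statuses = []
    for step in reversed(cleanups):
        ok = step()
        statuses.append(ok)
        if not ok:
            break
    return statuses


log: List[str] = []


def make_step(name: str, ok: bool = True) -> Callable[[], bool]:
    """Build a dummy cleanup step that records its name and reports `ok`."""
    def step() -> bool:
        log.append(name)
        return ok
    return step


# During startup, push one cleanup per resource successfully acquired.
cleanups = [
    make_step("close socket"),
    make_step("remove lockfile"),
    make_step("flush cache"),
]

print(run_cleanups(cleanups))  # [True, True, True]
print(log)                     # ['flush cache', 'remove lockfile', 'close socket']
```

Keeping the steps as separate list entries is what makes the per-step status available; a single concatenated block would have only one return value.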
Re: Returning from Rules
On Apr 19, 2004, at 12:06 AM, Luke Palmer wrote: Therefore, the first syntax can be redefined to evaluate the code block and assign the result to $0. Would you ever want to leave $0 unaltered? That's the only concern which comes to mind. My argument for using this notation stems from the fact that it would be a royal pain to write subs like: sub add ($a, $b) { $RET = $a + $b; } I think Pascal does something like this. JEff
Re: S5 updated
On Sep 22, 2004, at 5:06 PM, Edward Peschko wrote: How do you do that? Generation and matching are two different things algorithmically. yes, but they are intimately linked. just like the transformation of a string into a number, and from a number to a string. Two algorithmically different things as well, but they'd damn-well better be exact inverses of the other. But they're not: " 3 foo" --> 3 --> "3" My point is that if inputting strings into grammars is low level enough to be an op, why isn't generating strings *from* grammars? Maybe, because it's a less common thing to want to do? (Which is a bit ironic, since technically grammars are typically characterized as sets of rules for how to generate all the acceptable strings of the language they define, and parsing is sort of running that in reverse.) But you seemed to be saying (to which Luke replied the "How do you do that?" above) that they should somehow share an implementation, so that they can't accidentally diverge. But algorithmically it seems they can't share an implementation, so making them both fundamental ops doesn't achieve the goal of ensuring parity. JEff
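The `" 3 foo" --> 3 --> "3"` round trip can be reproduced with a small sketch. The `numify` function below is only an approximation of Perl 5's string-to-integer rule, written in Python for illustration: it handles leading whitespace, an optional sign, and decimal digits, and nothing else (no floats, hex, or warnings).

```python
import re

# Leading whitespace, optional sign, then digits; anything after is ignored.
_LEADING_INTEGER = re.compile(r"\s*([+-]?\d+)")


def numify(text: str) -> int:
    """Approximate Perl-style numification for integers (illustrative only)."""
    match = _LEADING_INTEGER.match(text)
    return int(match.group(1)) if match else 0


original = " 3 foo"
round_tripped = str(numify(original))

print(repr(round_tripped))        # '3'
print(round_tripped == original)  # False
```

Several different strings map to the same number, so no stringification rule can undo the conversion for all of them.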
Re: S5 updated
On Sep 23, 2004, at 5:27 PM, Edward Peschko wrote:

> On Thu, Sep 23, 2004 at 08:15:08AM -0700, Jeff Clites wrote:
>>>
>>> just like the transformation of a string into a number, and from a
>>> number to a string. Two algorithmically different things as well,
>>> but they'd damn-well better be exact inverses of the other.
>>
>> But they're not:
>>
>> " 3 foo" --> 3 --> "3"
>
> I'd say that that's a caveat of implementation, sort of a side effect
> of handling an error condition.

Nope, I'd call it fundamental semantics--it allows common idioms such as "0 but true" in Perl5, for example. It's just an explicit part of the rule for how Perl (and C's strtol/atoi functions) assign numerical values to strings.

But you might like this example better, which I assume will work in Perl6:

    "３" --> 3 --> "3"

(In case your email viewer doesn't render that, the first string contains the "fullwidth digit three", a distinct, wider version of a 3, used in some Asian languages.)

> By your criteria there are very few inverses - you could say that
> multiplication isn't an inverse of division because of zero, for
> example.

I'm reacting here to your saying, "exact inverses". But for this example, it's not my criteria--to a mathematician, multiplication over the real numbers (or over integers) is in fact not invertible.

> If you add the further caveat that everything in the string to be
> converted has to be an integer, then they *are* direct inverses.

Yes, the operation is invertible, if restricted to a domain over which it's invertible

>>> My point is that if inputting strings into grammars is low level
>>> enough to be an op, why isn't generating strings *from* grammars?
>>
>> Maybe, because it's a less common thing to want to do?
>>
> Well, there are two responses to the "that's not a common thing to want
> to do":
>
> 1) its not a common thing to want to do because its not a useful
>    thing to do.
> 2) its not a common thing to want to do because its too damn
>    difficult to do.
>
> I'd say that #2 is what holds. *Everybody* has difficulties with
> regular expressions - about a quarter of my job is simply looking at
> other people's regex used in data transformations and deciding what
> small bug is causing them to fail given a certain input.

Yeah, but when a regex isn't acting how I expected it to, I know that because I've already got in-hand an example of a string it matches which I thought it wouldn't, or one it fails to match which I thought it should. What I want to know is *why*--what part of the regex do I need to change. Generating strings which would have matched, wouldn't seem to help much.

And you might be underestimating how many strings can be generated from even a simple regex, and how uninformative they could be. For example, the Perl5 regex /[a-z]{10}/ will match 141167095653376 different strings, and it would likely be a very long time before I'd find out if this would match any strings starting with "x". I'd probably be left with the impression that it would only match strings starting with "a".

> Running a regular expression in reverse has IMO the best potential for
> making regexes transparent - you graphically see how they work and
> what they match.

How graphically?

> Why shouldn't that be reflected in the language itself?

Maybe because if it's likely to be used mostly for debugging, and can be implemented in a library, then it doesn't need to be implemented as an operator, and contribute to the general learning curve of the language's syntax.

JEff
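The count quoted for /[a-z]{10}/ is 26 to the 10th power, and a naive generator makes the "only strings starting with a" impression concrete. The sketch below is in Python and assumes the most obvious generation strategy (lexicographic enumeration); a smarter generator could sample differently, so treat it as one possible reading rather than what a real implementation would do.

```python
from itertools import islice, product
from string import ascii_lowercase

# Number of distinct ten-letter strings over a-z.
total = len(ascii_lowercase) ** 10
print(total)  # 141167095653376


def generate():
    """Yield every ten-letter lowercase string in lexicographic order."""
    for letters in product(ascii_lowercase, repeat=10):
        yield "".join(letters)


# The first few results say very little about what else would match.
first = list(islice(generate(), 3))
print(first)  # ['aaaaaaaaaa', 'aaaaaaaaab', 'aaaaaaaaac']

# How many strings this enumeration emits before the first one starting with "x".
strings_before_x = (ord("x") - ord("a")) * 26 ** 9
print(strings_before_x)
```

Only the first three strings are materialized here; walking the whole sequence up to the first "x" string would take far longer than anyone would wait.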
Re: Why lexical pads
On Sep 25, 2004, at 10:27 PM, Larry Wall wrote:

On Sat, Sep 25, 2004 at 10:01:42PM -0700, Larry Wall wrote:
: We've also said that MY is a pseudopackage referring to the current
: lexical scope so that you can hand off your lexical scope to someone
: else to read (but not modify, unless you are currently compiling
: yourself). However, random subroutines are not allowed access
: to your lexical scope unless you specifically give it to them,
: with the exception of $_ (as in 1 above). Otherwise, what's the
: point of lexical scoping?

Note that this definition of MY as a *view* of the current lexical scope from a particular spot is exactly what we already supply to an eval, so we're not really asking for anything that isn't already needed implicitly. MY is just the general way to invoke the pessimization you would have to do for an eval anyway.

A mildly interesting thought would be for eval to take additional parameters to make explicit what's visible to the eval'd code--essentially making the running of the code like a subroutine call. So the traditional eval would turn into something like "eval $str, MY", but you could also have "eval $str, $x, $y", or just "eval $str", which would execute in an "empty" lexical scope. That would allow additional optimizations at compile-time (and make MY the sole transporter of lexical scope), since not every eval would need what MY provides, but even more importantly, it would allow the programmer to protect himself against accidentally referencing a lexical he didn't intend, just because the code in his string coincidentally used the same variable name.

More optimization opportunities, and more explicit semantics.

But that's now a language issue, so I'm cc-ing this over to there.

JEff
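Python's built-in `eval` happens to take explicit namespace arguments, so it gives a rough feel for the proposal of passing visible names to the eval'd code. This is only an analogy, not a model of Perl 6 or Parrot semantics; `eval_with` is a hypothetical wrapper, and emptying `__builtins__` is not a security sandbox.

```python
def eval_with(source: str, **visible):
    """Evaluate an expression that can see only the names passed in.

    The globals dict is left empty apart from a blanked `__builtins__`,
    so the caller's own variables are invisible unless handed over.
    """
    return eval(source, {"__builtins__": {}}, dict(visible))


x, y = 2, 5

# Analogous to "eval $str, $x, $y": both names are handed over explicitly.
print(eval_with("x + y", x=x, y=y))  # 7

# Analogous to running with less scope than the code expects: `y` exists
# in the caller but was not passed, so the string cannot pick it up by accident.
try:
    eval_with("x + y", x=x)
except NameError as err:
    print("not visible:", err)
```

The second call shows the protection described above: a name that merely coincides with a caller variable is an error instead of a silent capture.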
explicit laws about whitespace in rules
I'd like to know where EXACTLY whitespace is permitted in rules. Is it legal to write

    \c [CHARACTER NAME]

or must I write

    \c[CHARACTER NAME]

?

-- 
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
comprehensive list of perl6 rule tokens
I'm working on a Perl 5 module that will allow for the parsing of a Perl 6 rule into a tree structure -- specifically, I'm subclassing/extending Regexp::Parser into Perl6::Rule::Parser. This module is designed ONLY to PARSE the contents of a rule; it is not concerned with the implementation of all the new things Perl 6 rules will offer, merely their syntax. Once this module is done, I'll work on a slightly broader one which will concern itself with the exterior of the rule (the m:xyz:abc('def')/.../ part, rather than the contents of the rule itself). To do this effectively, I need an exhaustive list of all tokens that can appear in a Perl 6 rule. By "token", I mean a single unit of purpose, such as ^^ and and **{3..6}. I have looked through the latest revisions of Apo05 and Syn05 (from Dec 2004) and come up with the following list: http://japhy.perlmonk.org/perl6/rules.txt The list is split up by leading character. I think it's complete, but I'm probably wrong, which is why I need more eyes to look over it and tell me what I've missed. I just got an email back from Damian which will help me move in the right direction, but I'd like this to be open to as many knowledgeable minds as possible. The part which needs a bit of clarification right now, in my opinion, is character classes. From what I can gather, these are character classes: <[a-z] +> <+ -[aeiouAEIOU]> but I want to be sure. I'm also curious about whitespace. Is "<[" one token, or can I write "< [a-z] >" and have it be a character class? Thanks for your help. Unless you're difficult. -- Jeff "japhy" Pinyan % How can we ever be the sold short or RPI Acacia Brother #734 % the cheated, we who for every service http://japhy.perlmonk.org/ % have long ago been overpaid? http://www.perlmonks.org/ %-- Meister Eckhart
Re: comprehensive list of perl6 rule tokens
On May 24, Jonathan Scott Duff said: On Tue, May 24, 2005 at 08:25:03PM -0400, Jeff 'japhy' Pinyan wrote: http://japhy.perlmonk.org/perl6/rules.txt That looks completish to me. (At least I didn't think, "hey! where's such and such?") Oh, frabjous day! One thing that I noticed and had to look up was <-prop X> though. Because ... I wish was allowed. I don't see why has to be confined to zero-width assertions. The part which needs a bit of clarification right now, in my opinion, is character classes. From what I can gather, these are character classes: <[a-z] +> <+ -[aeiouAEIOU]> I believe that Larry blessed Pm's idea to allow <[a..z]+digit> <+alpha-[aeiouAEIOU]> Ok, that's news to me. (I have yet to peruse the archives.) That's nice, not requiring you to <>-ize property names inside a character class assertion. I'd think whitespace would be permitted in between parts of a character class, but perhaps I'm wrong. That would kinda go against the whole "whitespace for readability" idea of Perl 6 rules, though. which implies to me that assertions starting with one of "<[", "<-" or "<+" should be treated as character classes. This doesn't seem to play well with <-prop X>. Maybe it does though. Considering the Unicode properties are like char class macro-things (like \w and \d), I don't see a problem, except for the fact that there's more than one "word" (chunk of non-whitespace) associated with them. Maybe Unicode properties retain their enclosing <>'s? Also, I think that it's [a..z] now rather than [a-z] but I'm not entirely sure. At least that's how PGE implements it. Ok. I'll wait for a message from On High about that. It's a minor detail. but I want to be sure. I'm also curious about whitespace. Is "<[" one token, or can I write "< [a-z] >" and have it be a character class? I think you need to write "<[" I expected as much. 
-- 
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
Re: comprehensive list of perl6 rule tokens
On May 25, Jonathan Scott Duff said: On Tue, May 24, 2005 at 11:24:50PM -0400, Jeff 'japhy' Pinyan wrote: I wish was allowed. I don't see why has to be confined to zero-width assertions. I don't either actually. One thing that occurred to me while responding to your original email was that might have slightly wrong huffmanization. Is zero-width the common case? If not, we could use character doubling for emphasis: consumes, while is zero-width. But that's not even the point. The ! in is not what makes a zero-width assertion, it's the 'after' that does that. All the ! does is negate the boolean sense of the assertion, which seems like a useful thing to have. Hrm, but I think I see the problem. How does one define "negation" for an arbitrary assertion? Is saying "if matches, fail"? Because then doesn't make mean the same as <-prop X>. We don't want negation, we want complement. I guess '!' is only well-defined for zero-width assertions. When you want to say , I guess > or > is the proper way to go. -- Jeff "japhy" Pinyan % How can we ever be the sold short or RPI Acacia Brother #734 % the cheated, we who for every service http://japhy.perlmonk.org/ % have long ago been overpaid? http://www.perlmonks.org/ %-- Meister Eckhart
Re: comprehensive list of perl6 rule tokens
On May 25, Mark A. Biggar said: Jonathan Scott Duff wrote: On Tue, May 24, 2005 at 11:24:50PM -0400, Jeff 'japhy' Pinyan wrote: I wish was allowed. I don't see why has to be confined to zero-width assertions. I don't either actually. One thing that occurred to me while responding to your original email was that might have slightly wrong huffmanization. Is zero-width the common case? If not, we could use character doubling for emphasis: consumes, while is zero-width. Now is a character class just like <+digit> and so under the new character class syntax, would probably be written <+prop X> or if the white space is a problem, then maybe <+prop:X> (or <+prop(X)> as Larry gets the colon :-), but that is a pretty adverbial case so ':' maybe okay) with the complemented case being <-prop:X>. Actually the 'prop' may be unnecessary at all, as we know we're in the character class sub-language because we saw the '<+', '<-' or '<[', so we could just define the various Unicode character property codes (I.e., Lu, Ll, Zs, etc) as pre-defined character class names just like 'digit' or 'letter'. Yeah, that was going to be my next step, except that the unknowing person might make a sub-rule of their own called, say, "Zs", and then which would take precedence? Perhaps is a good way of writing it. BTW, as a matter of terminology, <-digit> should probably be called the complement of <+digit> instead of the negation so as not to confuse it with the negative zero-width assertion case. Yeah, I just wrote that in my recent reply to Scott. I realized the nomenclature would be a point of confusion. -- Jeff "japhy" Pinyan % How can we ever be the sold short or RPI Acacia Brother #734 % the cheated, we who for every service http://japhy.perlmonk.org/ % have long ago been overpaid? http://www.perlmonks.org/ %-- Meister Eckhart
Re: comprehensive list of perl6 rule tokens
On May 26, Patrick R. Michaud said: On Tue, May 24, 2005 at 08:25:03PM -0400, Jeff 'japhy' Pinyan wrote: I have looked through the latest revisions of Apo05 and Syn05 (from Dec 2004) and come up with the following list: http://japhy.perlmonk.org/perl6/rules.txt I'll review the list below, but it's also worthwhile to read http://www.nntp.perl.org/group/perl.perl6.language/21120 which is Larry's latest missive on character classes, and http://www.nntp.perl.org/group/perl.perl6.language/20985 which describes the capturing semantics (but be sure to note the lengthy threads that follow concerning changes in the indexing from $1, $2, ... to $0, $1, ... ). I'll check them out. Right now, I'm really only concerned with syntax rather than implementation. Perl6::Rule::Parser will only parse the rule into a tree structure. & a&b N conjunction &varN subroutine I'm not sure that "&var" means subroutine anymore. A05 does mention Ok. If it goes away, I'm fine with that. x**{n..m} N previous atom n..m times Keeping in mind that the "n..m" can actually be any sort of closure Yeah, I know. ( (x) Y capture 'x' ) Y must match opening '(' It may be worth noting that parens not only capture, they also introduce a new scope for any nested subpattern and subrule captures. Ok. I don't think that'll affects me right now. :ignorecase N case insensitivity :i :global N match globally :g :continue N start scanning after previous match :c ...etc I'm not sure these are "tokens" in the sense of "single unit of purpose" in your original message. I think these are all adverbs, and the "token" is just the initial C<:> at the beginning of a group. I understand, but that set is particularly important to me, because as far as I am concerned, the rule /abc/ is the object Perl6::Rule::Parser::exact->new('abc'), whereas the rule /:i abc/ is the object Perl6::Rule::Parser::exactf->new('abc') -- this is using node terminology from Perl 5, where "exactf" means "exact with case folding". 
:keepallN all rules and invoked rules remember everything That's now ":parsetree" according to Damian's proposed capture rules. Ok. I haven't seen those yet. N backtracking fails completely N remove what matched up to this point from the string N we must be after the pattern P N we must NOT be after the pattern P N we must be before the pattern P N we must NOT be before the pattern P As with ':words', etc., I'm not sure that these qualify as "tokens" when parsing the regex -- the tokens are actually "<" or " I understand. Luckily this new syntax will enable me to abstract things in the parser. my $obj = $S->object(assertion => $name, $neg); # where $name is the part after the < or Since there's no longer different prefixes for every type of assertion, I no longer need to make specific classes of objects. N match whitespace by :w rules N match a space character (chr 32 ONLY) Here the token is " Right. <$rule> N indirect rule <::$rulename> N indirect symbolic rule <@rules> N like '@rules' <%rules> N like '%rules' <{ code }>N code produces a rule <&foo()> N subroutine returns rule <( code )>N code must return true or backtracking ensues Here the leading tokens are actually "<$", "<::$", "<@", "<%", "<{", "<&", and "<(", and I suspect we have " Per your second message, <[EMAIL PROTECTED]> would mean >, right? Of course, one could claim that these are really separated as in "<", "?", and "$" tokens, but PGE's parser currently treats them as a unit to make it easier to jump directly into the correct handler for what follows. Yes, so does mine. :) <[a-z]> N character class <+alpha> N character class <-[a-z]> N complemented character class The tokens for character class manipulation are currently "<[", "<+", and "&
Re: comprehensive list of perl6 rule tokens
In regards to http://www.nntp.perl.org/group/perl.perl6.language/21120 which discusses character class syntax in Perl 6, I have some comments to make.

First, I've been very interested in seeing proper set notation for char classes in Perl 5. I was pretty vocal about it during TPC in 2002, I think, and have since added some features that are in Perl 5 now that allow you to define your own Unicode properties with not only + and - and ! but & as well.

If we want to treat character classes as sets, then we should try to use notation that reads properly. I don't see how '+' and '|' are any different in this case: <+Foo +Bar> and should produce the same results always. I suppose the + is helpful in distinguishing a character class assertion from any other, though.

To *complement* a character class, I think the character ~ is appropriate. Intersection should be done with &. Subtraction can be provided with -, although it's really just a shorthand: A - B is really A & ~B... but I suppose huffman encoding tells us we should provide the - sign.

Here are some examples, then:

    <+alpha -vowels>       all alphabetic characters except vowels
    <+alpha & ~vowels>     same thing
    <[a..z] -[aeiou]>      all characters 'a' through 'z' minus vowels
    <[a..z] & ~[aeiou]>    same thing
    <~(X & Y) | Z>         all characters not in X-and-Y, or in Z

The last example shows <~ which is currently unclaimed as far as assertions go. Since I'd be advocating the removal of a unary - in character classes (to be replaced by ~), I think this would be ok. The allowance for a unary + in character classes has already been justified.

For the people who are really going to use it, the notation won't be foreign. And I'd expect most people who'd use it would actually abstract a good portion of it away into their own property definitions, so that <~(X & Y) | Z> would actually just be <+My_XYZ_Property> which would be defined elsewhere.

What say you?
-- 
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart
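The identity "A - B is really A & ~B" and the <~(X & Y) | Z> example can be checked with ordinary sets. The sketch below uses Python sets as a stand-in for character classes; the universe is restricted to printable ASCII purely to keep complements finite, and the sample sets X, Y, Z are made up for illustration.

```python
import string

# A finite stand-in for "all characters" so that complement is computable.
universe = set(string.printable)

alpha = set(string.ascii_letters)
vowels = set("aeiouAEIOU")


def complement(chars: set) -> set:
    """The ~ operator from the proposal: everything in the universe not in `chars`."""
    return universe - chars


# <+alpha -vowels> versus <+alpha & ~vowels>
by_subtraction = alpha - vowels
by_intersection = alpha & complement(vowels)
print(by_subtraction == by_intersection)  # True

# <~(X & Y) | Z>, with De Morgan's law as a cross-check.
X, Y, Z = set("abcdef"), set("defghi"), set("xyz")
not_both_or_z = complement(X & Y) | Z
print(not_both_or_z == complement(X) | complement(Y) | Z)  # True
print("d" in not_both_or_z, "a" in not_both_or_z)          # False True
```

With real Unicode classes the universe is far larger, but the algebra is the same, which is the argument for treating subtraction as shorthand.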
Re: comprehensive list of perl6 rule tokens
On May 26, Patrick R. Michaud said: N backtracking fails completely N remove what matched up to this point from the string N we must be after the pattern P N we must NOT be after the pattern P N we must be before the pattern P N we must NOT be before the pattern P As with ':words', etc., I'm not sure that these qualify as "tokens" when parsing the regex -- the tokens are actually "<" or " I'm curious if and "capture" anything. They don't start with '?', so following the guidelines, it would appear they capture, but that doesn't make sense. Should they be written as and , or is the fact that they capture silently ignored because they're not consuming anything? Same thing with and . And with and . It should be assumed that doesn't capture because it can only capture if P matches, in which case fails. So, what's the deal? -- Jeff "japhy" Pinyan % How can we ever be the sold short or RPI Acacia Brother #734 % the cheated, we who for every service http://japhy.perlmonk.org/ % have long ago been overpaid? http://www.perlmonks.org/ %-- Meister Eckhart
Re: comprehensive list of perl6 rule tokens
Further woes, arguments, questions:

In regards to <@array>, A5 says "A leading @ matches like a bare array..." but this is an over-generalization. A leading '@' merely indicates the rule is found in an array. <@array[3]> would be the same as <$fourth_element_of_array>, assuming those two values are identical.

Next, about <before RULE> and <after RULE>. What is the justification for that syntax? There is no other example of a <-sequence with whitespace, at least that I can see. It would appear "RULE" is an argument of sorts to the 'before' and 'after' rules, but how do they access that argument? How do I write a rule that takes an argument?

-- 
Jeff "japhy" Pinyan         %  How can we ever be the sold short or
RPI Acacia Brother #734     %  the cheated, we who for every service
http://japhy.perlmonk.org/  %  have long ago been overpaid?
http://www.perlmonks.org/   %    -- Meister Eckhart