String mortality
Two more problems found in string.c; these relate to the creation of temporary strings to hold results of transcoding, in string_concat and string_compare.

As per the latest (I think) decision from Dan ("Avoiding the deadlands", 9th April: http://www.mail-archive.com/perl6-internals@perl.org/msg09072.html), the following patch does the following:

1) Add BUFFER_neonate_FLAG (actually renamed BUFFER_needs_GC_FLAG, since this is not used at present, and can always be added again if it is needed in the future) - feel free to change the name to anything you fancy
2) Add neonate counters to interpreter structure (I added three separate counters; as it is really only required as a flag to indicate that cleanup is needed, one would probably suffice)
3) Change GC routines to treat 'neonate' string/buffer headers in the same way as constants
4) Change string_concat and string_compare to set and clear the 'neonate' flag as required

Still required as per the above-referenced decision:

1) Implement equivalent flag for PMCs (unless the 'immune' flag serves the same purpose?)
2) Procedure to clear neonate flag on all headers from time to time

Note that this patch gives compiler warnings in string.c because of the 'const' attribute on the parameters, and therefore should not be applied in its current form; I'm sure somebody can figure out how best to resolve the warnings.

--
Peter Gibbs
EmKel Systems

Index: include/parrot/interpreter.h
===
RCS file: /home/perlcvs/parrot/include/parrot/interpreter.h,v
retrieving revision 1.40
diff -u -r1.40 interpreter.h
--- include/parrot/interpreter.h 3 Apr 2002 04:01:41 - 1.40
+++ include/parrot/interpreter.h 22 Apr 2002 12:58:47 -
@@ -142,6 +142,9 @@
                                    requests are there? */
     UINTVAL GC_block_level;     /* How many outstanding GC block
                                    requests are there? */
+    UINTVAL neonate_strings;/* How many protected newborn strings ? */
+    UINTVAL neonate_buffers;/* How many protected newborn buffers ? */
+    UINTVAL neonate_PMCs;   /* How many protected newborn PMCs ? */
 } Interp;
 
 #define PCONST(i) PF_CONST(interpreter->code, (i))
Index: interpreter.c
===
RCS file: /home/perlcvs/parrot/interpreter.c,v
retrieving revision 1.84
diff -u -r1.84 interpreter.c
--- interpreter.c 15 Apr 2002 18:05:18 - 1.84
+++ interpreter.c 22 Apr 2002 13:05:56 -
@@ -497,6 +497,9 @@
     interpreter->memory_collected = 0;
     interpreter->DOD_block_level = 1;
     interpreter->GC_block_level = 1;
+    interpreter->neonate_strings = 0;
+    interpreter->neonate_buffers = 0;
+    interpreter->neonate_PMCs = 0;
 
     /* Set up the memory allocation system */
     mem_setup_allocator(interpreter);
Index: include/parrot/string.h
===
RCS file: /home/perlcvs/parrot/include/parrot/string.h,v
retrieving revision 1.35
diff -u -r1.35 string.h
--- include/parrot/string.h 24 Mar 2002 22:30:06 - 1.35
+++ include/parrot/string.h 22 Apr 2002 12:59:23 -
@@ -65,8 +65,8 @@
     /* Private flag for the GC system. Set if the buffer's in use as
      * far as the GC's concerned */
     BUFFER_live_FLAG = 1 << 12,
-    /* Mark the bufffer as needing GC */
-    BUFFER_needs_GC_FLAG = 1 << 13,
+    /* Mark the bufffer as newborn, for protection from infant death */
+    BUFFER_neonate_FLAG = 1 << 13,
     /* Mark the buffer as on the free list */
     BUFFER_on_free_list_FLAG = 1 << 14,
     /* This is a constant--don't kill it! */
Index: resources.c
===
RCS file: /home/perlcvs/parrot/resources.c,v
retrieving revision 1.45
diff -u -r1.45 resources.c
--- resources.c 19 Apr 2002 01:33:56 - 1.45
+++ resources.c 22 Apr 2002 13:00:47 -
@@ -341,7 +314,8 @@
         STRING *string_array = cur_string_arena->start_STRING;
         for (i = 0; i < cur_string_arena->used; i++) {
             /* Tentatively unused, unless it's a constant */
-            if (!(string_array[i].flags & BUFFER_constant_FLAG)) {
+            if (!(string_array[i].flags &
+                (BUFFER_constant_FLAG | BUFFER_neonate_FLAG))) {
                 string_array[i].flags &= ~BUFFER_live_FLAG;
             }
         }
@@ -353,7 +327,8 @@
         Buffer *buffer_array = cur_buffer_arena->start_Buffer;
         for (i = 0; i < cur_buffer_arena->used; i++) {
             /* Tentatively unused, unless it's a constant */
-            if (!(buffer_array[i].flags & BUFFER_constant_FLAG)) {
+            if (!(buffer_array[i].flags &
+                (BUFFER_constant_FLAG | BUFFER_neonate_FLAG))) {
                 buffer_array[i].flags &= ~BUFFER_live_FLAG;
             }
         }
Index: string.c
===
RCS file: /home/perlcvs/parrot/string.c,v
retrieving revision 1.73
diff -u -r1.73 string.c
--- string.c 15 Apr 2002 20:34:28
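The two resources.c hunks in the patch come down to a single mask test. The following is only a minimal, self-contained sketch of that test, not Parrot's actual code: `Header` and `mark_tentatively_unused` are hypothetical stand-ins for the real header types and GC routine, and the bit chosen for BUFFER_constant_FLAG is an assumption, since the patch does not show its value.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values for live, neonate and on_free_list are taken from the
 * string.h hunk; the value of BUFFER_constant_FLAG is NOT shown in the
 * patch, so 1 << 15 here is an assumption made for illustration. */
enum {
    BUFFER_live_FLAG         = 1 << 12,
    BUFFER_neonate_FLAG      = 1 << 13,
    BUFFER_on_free_list_FLAG = 1 << 14,
    BUFFER_constant_FLAG     = 1 << 15   /* assumed */
};

/* Hypothetical stand-in for a STRING or Buffer header. */
typedef struct {
    unsigned int flags;
} Header;

/* Clear the live bit on every header that is neither constant nor
 * neonate, mirroring the "tentatively unused" loops in the patch. */
static void mark_tentatively_unused(Header *headers, size_t used)
{
    size_t i;

    assert(headers != NULL || used == 0);
    for (i = 0; i < used; i++) {
        if (!(headers[i].flags &
              (BUFFER_constant_FLAG | BUFFER_neonate_FLAG))) {
            headers[i].flags &= ~(unsigned int)BUFFER_live_FLAG;
        }
    }
}
```

With this test, a header created inside string_concat or string_compare keeps its live bit for as long as the neonate flag is set, which is why the patch also needs the (still missing) procedure that clears the flag from time to time.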
Re: Please rename 'but' to 'has'.
On Sun, 2002-04-21 at 10:59, Trey Harris wrote:

> 0 has true
>
> my first reaction would be, "huh? Since when?"

Dare I say... "now"? ;-)

Sorry, someone had to say it.

Personally, even though it sucks up namespace, I think what we're seeing here is a need for more than one keyword that are synonyms. "but" and "now" seem to cover a good deal of ground.

    0 now true

Is misleading, IMHO, as 0 is not now true. 0, in this context is an expression, and we're saying that that expression is now true. "but" conveys this much more clearly. However, as many have pointed out, there are a number of cases where but is equally misleading.

Is there any problem with allowing both but and now? It might even be elegant to use both at the same time:

    $x now integer but true

which is clearer to my eye than

    $x now integer now true

which seems to change the properties of $x twice without reconciling the changes with each other.

In any other language this would be unthinkable, but I think it fits nicely with Perl's philosophy. Not TMTOWTDI, which I think is often used to excuse the inexcusable, but the idea that Perl reflects the ways in which humans use language. We want to convey shades of meaning that do not translate directly to action.

So, have I just lost it, or would it make sense to have now and but?

Apologies to the person who started this thread. I know you thought "has" was ideal, and I understand why. It's just that between "but" and "now", I think you get more ground covered than you do with "has" and either one.
RE: Regex and Matched Delimiters
On Sat, 2002-04-20 at 05:06, Mike Lambert wrote:
> > He then went on to describe something I didn't understand at all.
> > Sorry.
>
> Few corrections to what you wrote:
>
> To avoid the problem of extending {} to support new features with a
> character 'x', without breaking stuff that might have an 'x' immediately
> after the '{', my proposal is to require one space after the { before the
> real regex appears.

I hope that you mean "one or more whitespace characters", not just a space. The following would be correct, no?

    /{| .* }/

Anything else would seem rather confusing to the average Perl programmer.
no money down idea for computed goto
I don't have the time right now to do this myself, so here is a simple idea to evaluate.

Currently, the computed goto decode and dispatch is essentially:

    goto *ops_addr[ *cur_opcode ];

Now a big part of the gain of the prederef runops core comes from decoding each op once instead of each time it is executed. The prederef core does this by creating an array shadowing the byte code which stores pointers to the op functions for the decoded ops.

One could modify the computed goto runops analogously, by creating a parallel array that stores the decoded label address of each op. Suppose the parallel array is pointed to by decoded_ops; op dispatch would then look like:

    goto *decoded_ops[ cur_opcode - start_of_bytecode ];

The C compiler might be able to optimize away the explicit subtraction. If not, one can do the equivalent pointer math, but I won't try to write that here.

In the ideal case, where sizeof(opcode_t) == sizeof(void *), one could possibly cheat like the jit compiler does and overwrite the original bytecode instead of using a parallel array, but that may not be good.

--
Jason
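The parallel-array dispatch described above can be sketched as a toy interpreter. This is only an illustration under stated assumptions, not Parrot's runops core: the three-op instruction set (OP_ADD, OP_NEG, OP_END), the `op_len` table, `MAX_CODE` and `run_predecoded` are all hypothetical, and the code relies on the GCC/Clang "labels as values" extension (`&&label`, `goto *ptr`), so it is not portable ISO C.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;

/* Hypothetical toy instruction set: OP_ADD takes one operand. */
enum { OP_ADD = 0, OP_NEG = 1, OP_END = 2, NUM_OPS = 3 };
enum { MAX_CODE = 64 };

/* Runs a well-formed program of n slots that ends in OP_END and
 * returns the accumulator. */
static long run_predecoded(const opcode_t *start_of_bytecode, size_t n)
{
    /* Label addresses, indexed by opcode number (GCC/Clang extension). */
    static void *const ops_addr[NUM_OPS] = { &&op_add, &&op_neg, &&op_end };
    /* Slots occupied by each op, operands included. */
    static const size_t op_len[NUM_OPS] = { 2, 1, 1 };
    /* Parallel array shadowing the bytecode; operand slots stay NULL. */
    void *decoded_ops[MAX_CODE] = { 0 };
    const opcode_t *cur_opcode = start_of_bytecode;
    long acc = 0;
    size_t i;

    assert(n > 0 && n <= MAX_CODE);

    /* Decode each op once, up front. */
    for (i = 0; i < n; i += op_len[start_of_bytecode[i]]) {
        assert(start_of_bytecode[i] >= 0 && start_of_bytecode[i] < NUM_OPS);
        decoded_ops[i] = ops_addr[start_of_bytecode[i]];
    }

/* Dispatch through the shadow array instead of ops_addr[*cur_opcode]. */
#define DISPATCH() goto *decoded_ops[cur_opcode - start_of_bytecode]

    DISPATCH();

op_add:
    acc += cur_opcode[1];
    cur_opcode += 2;
    DISPATCH();
op_neg:
    acc = -acc;
    cur_opcode += 1;
    DISPATCH();
op_end:
    return acc;
#undef DISPATCH
}
```

For example, the program `{ OP_ADD, 5, OP_NEG, OP_ADD, 2, OP_END }` adds 5, negates, then adds 2. Whether the saved table lookup outweighs the extra subtraction and the memory for the shadow array is exactly the question the post leaves open for measurement.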
Re: [PATCH] intconst parameter type
At 12:03 PM +1000 4/19/02, Andrew J Bromage wrote:
>G'day all.
>
>On Thu, Apr 18, 2002 at 09:09:59PM -0400, Dan Sugalski wrote:
>
>> I've applied this, with the exception of the branch and bsr ops. At
>> the moment, I agree--I can't see any case where "if" or "gte" needs
>> to have a variable target. (I can see it for branch, bsr, jump, and
>> jsr, as those are partially for subroutine dispatch, so no changes
>> there)
>
>OK, this raises a question: What _is_ the difference between branch and
>jump, or bsr and jsr? The answer I assumed was that jump/jsr were for
>variable targets and branch/bsr were for static targets. Is that wrong?

Yup. The branches are relative to the current PC, the jumps take absolute addresses.

--
Dan

--"it's like this"---
Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
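The relative/absolute distinction in the answer above can be sketched in a few lines of C. This is an illustration only; `do_branch` and `do_jump` are hypothetical helpers, not Parrot's op implementations, and `opcode_t` is assumed to be a plain integer type.

```c
#include <assert.h>
#include <stddef.h>

typedef long opcode_t;

/* branch/bsr: the operand is an offset from the current PC. */
static opcode_t *do_branch(opcode_t *cur_pc, long offset)
{
    assert(cur_pc != NULL);
    return cur_pc + offset;
}

/* jump/jsr: the operand is already the absolute destination. */
static opcode_t *do_jump(opcode_t *target)
{
    assert(target != NULL);
    return target;
}
```

Either form can take its operand from a constant or from a register; the difference is only in how the operand is interpreted.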
Re: Regex and Matched Delimiters
On Sat, 2002-04-20 at 14:33, Me wrote:

> [2c. What about ( data) or (ops data) normally means non-capturing,
> ($2 data) captures into $2, ($foo data) captures into $foo?]

Very nice (but, I assume you meant {$foo data})! This does add another special case to the regexp parser's handling of "$", but it seems like it would be worth it.

Makes me think of the even slightly hairier:

    {&foo data}

or even more hair-full:

    {&{$foo} data}

for references. Where you capture into the usual positional, and then invoke foo with the variable as parameter. Would be pretty nice closure-wise:

    sub match_with_alert($re,$id,$ops,$fac,$pri) {
        openlog $id,$ops,$fac;
        my $alert = sub ($match) {
            syslog $pri, "Matched regexp: $match";
        }
        return study /{&{$alert} $re}/;
    }
    my $m = match_with_alert('ROOT login',$0,0,LOG_USER,PRI_CRIT);
    for <> -> $_ {
        /$m/
    }

That would certainly be a handy thing that would set Perl apart from the pack of advanced regexp languages that don't support closures.

Some other things come to mind as well, but I'm not sure how evil they are. For example:

    sub decrypt($data is rw) {
        $data = rot13($data);
    }
    print "The secret message is: ", /^Encrypted: {&decrypt .*}/, "\n";
Re: Regex and Matched Delimiters
> Very nice (but, I assume you meant {$foo data})!

I didn't mean that (even if I should have).

Aiui, Mike's final suggestion was that parens end up doing all the (ops data) tricks, and braces are used purely to do code insertions. (I really liked that idea.)

So:

    Perl 5        Perl6
    (data)        ( data)
    (?opsdata)    (ops data)
    ({})          {}

--
ralph
Re: Regex and Matched Delimiters
On Mon, 2002-04-22 at 14:18, Me wrote:
> > Very nice (but, I assume you meant {$foo data})!
>
> I didn't mean that (even if I should have).
>
> Aiui, Mike's final suggestion was that parens end up
> doing all the (ops data) tricks, and braces are used
> purely to do code insertions. (I really liked that idea.)
>
> So:
>
>     Perl 5        Perl6
>     (data)        ( data)
>     (?opsdata)    (ops data)
>     ({})          {}

I don't like that particular way of looking at things, but either way my comments about subroutines and closures still hold.
Subroutines...
Okay, I've been thinking about subroutines lately. A lot. I had planned on putting them off a bit until we'd gotten scratchpads and globals done, but I think I'd as soon get this off for discussion, so maybe we can have the rough edges worked out by the time we have hashes.

Subroutines, generally, are a pain. They carry far more than just a pointer to a chunk of bytecode or real code, and because of that the simple jsr is just not going to cut it. So it's dead.

For subs, we have to worry about plain subs, subs that capture their lexical & global scopes, and subs that capture their stacks.

We also need to know where to enter the sub (coroutines may change this), whether the sub's got a native-code component (for XS and JITted subs) and what the 'original' starting spot for the sub is in case it's been changed by coroutine yielding.

So, with all that, there's just too darned much stuff needed to *not* call with a context object of some sort. So we're going to. Here's the protocol:

1) Sub calls are made with the call opcode. P0 is the subroutine context object. (Which is what we'd get out of the symbol table or from a closure creation)

2) On entry to a sub, you always start a new set of stack chunks. This'll facilitate continuations.

3) We're having a new rule--you may *not* take a continuation from within an opcode function! This is probably one of those "Well, Duh!" things but better to have it up front.

4) P1 is the continuation of the caller, *if* it's created. Which it doesn't have to be. CallCC fills this in, call doesn't. (Yeah, we're turning into Scheme. I'm horrified too)

5) P2 is the current object, also potentially empty, to facilitate method calls. (I don't think a method should be able to be a continuation, but the very thought of that makes my head hurt enough to not be able to think about it clearly)

I think there's more, but that should probably suffice for now. I *am* nervous that this is making sub calls more expensive than I'd like 'em to be.
-- Dan --"it's like this"--- Dan Sugalski even samurai [EMAIL PROTECTED] have teddy bears and even teddy bears get drunk
RE: Subroutines...
Dan Sugalski:
# Okay, I've been thinking about subroutines lately. A lot. I had
# planned on putting them off a bit until we'd gotten scratchpads and
# globals done, but I think I'd as soon get this off for discussion, so
# maybe we can have the rough edges worked out by the time we have
# hashes.
#
# Subroutines, generally, are a pain. They carry far more than just a
# pointer to a chunk of bytecode or real code, and because of that the
# simple jsr is just not going to cut it. So it's dead.
#
# For subs, we have to worry about plain subs, subs that capture their
# lexical & global scopes, and subs that capture their stacks.
#
# We also need to know where to enter the sub (coroutines may change
# this), whether the sub's got a native-code component (for XS and
# JITted subs) and what the 'original' starting spot for the sub is in
# case it's been changed by coroutine yielding.

How about we instead declare that all subs have One True Entry Point, and the sub does whatever is needed there? Normal subs can just set up scoping and jump to the beginning of the sub's body; coroutines retrieve their context object and use it; XS and JIT call enternative; etc. That way we only pay for the overhead on subs that need it.

# So, with all that, there's just too darned much stuff needed to *not*
# call with a context object of some sort. So we're going to. Here's
# the protocol:
#
# 1) Sub calls are made with the call opcode. P0 is the subroutine
# context object. (Which is what we'd get out of the symbol table or
# from a closure creation)
#
# 2) On entry to a sub, you always start a new set of stack chunks.
# This'll facilitate continuations.
#
# 3) We're having a new rule--you may *not* take a continuation from
# within an opcode function! This is probably one of those "Well, Duh!"
# things but better to have it up front.
#
# 4) P1 is the continuation of the caller, *if* it's created. Which it
# doesn't have to be. CallCC fills this in, call doesn't. (Yeah, we're
# turning into Scheme. I'm horrified too)
#
# 5) P2 is the current object, also potentially empty, to facilitate
# method calls. (I don't think a method should be able to be a
# continuation, but the very thought of that makes my head hurt enough
# to not be able to think about it clearly)

If you need a continuation, you can just use a closure to generate a normal but anonymous sub with the object as a lexical, can't you? That way a continuation is just an object. (Of course, I could just be screwed up--I don't understand continuations well enough to be sure.)

# I think there's more, but that should probably suffice for now. I
# *am* nervous that this is making sub calls more expensive than I'd
# like 'em to be.

--Brent Dax <[EMAIL PROTECTED]>
@roles=map {"Parrot $_"} qw(embedding regexen Configure)

#define private public
--Spotted in a C++ program just before a #include
Re: [PATCH] intconst parameter type
G'day all.

On Thu, Apr 18, 2002 at 09:09:59PM -0400, Dan Sugalski wrote:

> >> I've applied this, with the exception of the branch and bsr ops.

[...]

On Mon, Apr 22, 2002 at 11:01:35AM -0400, Dan Sugalski wrote:

> The branches are relative to the current PC, the jumps take
> absolute addresses.

So why do branch/bsr need register targets, as opposed to jump/jsr which certainly do?

Cheers,
Andrew Bromage
Re: Subroutines...
On Tue, Apr 23, 2002 at 09:28:29AM +1000, Andrew J Bromage wrote:
> G'day all.
>
> On Mon, Apr 22, 2002 at 04:31:32PM -0400, Dan Sugalski wrote:
>
> > 3) We're having a new rule--you may *not* take a continuation from
> > within an opcode function! This is probably one of those "Well, Duh!"
> > things but better to have it up front.
>
> I see why you say this, but I'm not sure it's necessarily a good idea.
> There are a few languages which rely on continuations within functions
> (Prolog is the one that springs to mind, but there are others), and
> without it, the generated code might get unnecessarily bloated.

That wasn't my understanding. "opcode function" to me means the internal implementation of a single opcode.

Assuming I am correct, I still don't quite get what the restriction is. Is this so that the interpreter is holding the whip at the moment stacks need to be juggled? What does this allow and disallow for extensions (eg an extension that defines its own opcode?) And what does this gain over an API entry to tell the interpreter that you're taking a continuation?

I guess I just don't have enough of a mental model of how parrot will implement continuations to understand this. Is this similar to the analogous current situation, where opcodes can't muck with the program counter without using one of the funky 'goto POP()' family of macros?
Re: Subroutines...
G'day all.

On Mon, Apr 22, 2002 at 04:31:32PM -0400, Dan Sugalski wrote:

> 3) We're having a new rule--you may *not* take a continuation from
> within an opcode function! This is probably one of those "Well, Duh!"
> things but better to have it up front.

I see why you say this, but I'm not sure it's necessarily a good idea. There are a few languages which rely on continuations within functions (Prolog is the one that springs to mind, but there are others), and without it, the generated code might get unnecessarily bloated.

Cheers,
Andrew Bromage
Re: Please rename 'but' to 'has'.
Aaron Sherman writes:
: On Sun, 2002-04-21 at 10:59, Trey Harris wrote:
:
: > 0 has true
: >
: > my first reaction would be, "huh? Since when?"
:
: Dare I say... "now"? ;-)
:
: Sorry, someone had to say it.
:
: Personally, even though it sucks up namespace, I think what we're seeing
: here is a need for more than one keyword that are synonyms. "but" and
: "now" seem to cover a good deal of ground.
:
:     0 now true
:
: Is misleading, IMHO, as 0 is not now true. 0, in this context is an
: expression, and we're saying that that expression is now true. "but"
: conveys this much more clearly. However, as many have pointed out, there
: are a number of cases where but is equally misleading.
:
: Is there any problem with allowing both but and now? It might even be
: elegant to use both at the same time:
:
:     $x now integer but true
:
: which is clearer to my eye than
:
:     $x now integer now true
:
: which seems to change the properties of $x twice without reconciling the
: changes with each other.
:
: In any other language this would be unthinkable, but I think it fits
: nicely with Perl's philosophy. Not TMTOWTDI, which I think is often used
: to excuse the inexcusable, but the idea that Perl reflects the ways in
: which humans use language. We want to convey shades of meaning that do
: not translate directly to action.
:
: So, have I just lost it, or would it make sense to have now and but?
:
: Apologies to the person who started this thread. I know you thought
: "has" was ideal, and I understand why. It's just that between "but" and
: "now", I think you get more ground covered than you do with "has" and
: either one.

Perl 6 will try to avoid synonyms but make it easy to declare them. At worst it would be something like:

    my sub operator:now ($a,$b) is inline { $a but $b }

Larry
Re: Regex and Matched Delimiters
Me writes:
: > Very nice (but, I assume you meant {$foo data})!
:
: I didn't mean that (even if I should have).
:
: Aiui, Mike's final suggestion was that parens end up
: doing all the (ops data) tricks, and braces are used
: purely to do code insertions. (I really liked that idea.)
:
: So:
:
:     Perl 5        Perl6
:     (data)        ( data)
:     (?opsdata)    (ops data)
:     ({})          {}

Hmm. Let me spill a few beans about where I'm going with A5. I've been thinking similar thoughts about the problem of overloading parens so heavily in Perl 5, but I'm going in a slightly different direction with it. The basic principles for the new regexen are:

    * Parens always capture.
    * Braces are always closures.
    * Square brackets are always character classes.
    * Angle brackets are always metasyntax (along with backslash).

So a first whack at the differences might be:

    Old             New
    ---             ---
    //              // ???
    ?pat?           // or even m ???
    /pat/x          /pat/
    /^pat$/m        /^^pat$$/
    /./s            // or /<.>/ ???
    \p{prop}        <+prop> ???
    \P{prop}        <-prop> ???
    space           (or \h for "horizontal"?)
    {n,m}
    \t              also
    \n              also or (latter matching logical newline)
    \r              also
    \f              also
    \a              also
    \e              also
    \033            same
    \x1B            same
    \x{263a}        \x<263a> ???
    \c[             same
    \N{name}
    \l              same
    \u              same
    \Lstring\E      \L
    \Ustring\E      \U
    \E              gone
    [\040\t]        \h      plus any Unicode horizontal whitespace
    [\r\n\ck]       \v      plus any Unicode vertical whitespace

    \b              same
    \B              same
    \A              ^
    \Z              same?
    \z              $
    \G              , but assumed in nested patterns?

    \1              $1

    \Q$var\E        $var    always assumed literal, so $1 is literal backref
    $var            <$var>  assumed to be regex
    =~ $re          =~ /<$re>/      ouch?
    (??{$rule})
    (?{ code })     { code } with failure semantics
    (?#...)         {"..."} :-)
    (?:...)         <:...>
    (?=...)
    (?!...)
    (?<=...)
    (?
    (?>...)
    (?(cond)t|f)    Not sure. Could just use { if ... }

Obviously the and syntaxes will be user extensible. We have to be able to support full grammars. I consider it a feature that looks like a non-terminal in standard BNF notation. I do not consider it a misfeature that resembles an HTML or XML tag, since most of those languages need to be matched with a fancy rule named anyway.

An interesting idea would be that if you say

    m

or

    m{code}

it's as if you said

    m//

or

    m/{code}/

The latter is particularly interesting to me in that I can see uses for patterns that are Perl code at the top level rather than regex literal.

Any closure within a regular expression has full access to the current state object for the match. So most of the RFCs proposing ad hoc mechanisms for saving submatches in various kinds of variables can be handled with closures.

    /(...)(...)(...) { @array = .all } /

or

    /(...) { $first = $+ } (...) { $second = $+ } (...) { $third = $+ }/

or

    / () () { .node = ["if",$1,$2] } /      # shades of yacc

or whatever. Could have a <$foo=...> as syntactic sugar, perhaps. But we need the general mechanism for building up parse trees of arrays of hashes of arrays of arrays of hashes of arrays of hashes of...

I haven't decided yet whether matches embedded in the closure should automatically pick up where the outer match is, or whether there should be some explicit match op to mean that, much like \G only better. I'm thinking when the current topic is a match state, we automatically continue where we left off, and require explicit =~ to start an unrelated match.

I also haven't committed to any particular mechanism for defining a set of related rules in a grammar. Obviously it needs to be a good enough mechanism to parse Perl and its variants, which means it probably needs to be OO based, and you make new grammars by derivation from the base grammar and overriding the rules you want to change.

Sorry if this is a bit delirious--I'm fighting off some kind of infection, and my nights have been shortchanged lately by the neighborhood panhandler who doesn't seem to understand either complicated concepts like "bedtime" or simple concepts like "no".

Larry
Re: Regex and Matched Delimiters
> (?=...)
> (?!...)
> (?<=...)
> (?
> (?>...)

Yummy :) I'd say this is about perfect. The look(ahead|behind)s, er, look<:ahead|behind>s are used seldom enough that this is practical. And it's much clea[nr]er than that (?=...) crap. (Think I'm going overboard with this tregext?)

And are you going to reveal the method by which you define your own s, so we can overload it with personal ungrounded opinions? (On the other hand, it'd probably just stick and not move, because you said it.)

> Sorry if this is a bit delirious--I'm fighting off some kind of
> infection, and my nights have been shortchanged lately by the
> neighborhood panhandler who doesn't seem to understand either
> complicated concepts like "bedtime" or simple concepts like "no".

bed...what?

Luke
RE: Regex and Matched Delimiters
Larry Wall:
# Me writes:
# : > Very nice (but, I assume you meant {$foo data})!
# :
# : I didn't mean that (even if I should have).
# :
# : Aiui, Mike's final suggestion was that parens end up
# : doing all the (ops data) tricks, and braces are used
# : purely to do code insertions. (I really liked that idea.)
# :
# : So:
# :
# :     Perl 5        Perl6
# :     (data)        ( data)
# :     (?opsdata)    (ops data)
# :     ({})          {}
#
# Hmm. Let me spill a few beans about where I'm going with A5.
# I've been thinking similar thoughts about the problem of
# overloading parens so heavily in Perl 5, but I'm going in a
# slightly different direction with it. The basic principles
# for the new regexen are:
#
#     * Parens always capture.
#     * Braces are always closures.
#     * Square brackets are always character classes.
#     * Angle brackets are always metasyntax (along with backslash).
#
# So a first whack at the differences might be:
#
#     Old             New
#     ---             ---
#     //              // ???
#     ?pat?           // or even m ???

Whoa, those are moving to the front?!?

#     /pat/x          /pat/
#     /^pat$/m        /^^pat$$/

That's...odd. Is $$ (the variable) going away?

#     /./s            // or /<.>/ ???

I think that . is too common a metacharacter to be relegated to this.

#     \p{prop}        <+prop> ???
#     \P{prop}        <-prop> ???

Intriguing.

#     space           (or \h for "horizontal"?)

Same thinking as '.'.

#     {n,m}

Ah, OK.

#     \t              also
#     \n              also or (latter matching logical newline)
#     \r              also
#     \f              also
#     \a              also
#     \e              also

I can tell you right now that these are going to screw people up. They'll try to use these in normal strings and be confused when it doesn't work. And you probably won't be able to emit a warning, considering how much CGI Perl munches.

#     \033            same
#     \x1B            same
#     \x{263a}        \x<263a> ???

Why? Wouldn't we want the same thing to work in quoted strings? (Or are those changing syntaxes too?)

#     \c[             same
#     \N{name}
#     \l              same
#     \u              same
#     \Lstring\E      \L
#     \Ustring\E      \U

So that's changed from whenever you talked about \q{} ?

#     \E              gone
#     [\040\t]        \h      plus any Unicode horizontal whitespace
#     [\r\n\ck]       \v      plus any Unicode vertical whitespace
#
#     \b              same
#     \B              same
#     \A              ^
#     \Z              same?
#     \z              $

Are you sure that optimizes for the common case?

#     \G              , but assumed in nested patterns?
#
#     \1              $1
#
#     \Q$var\E        $var    always assumed literal, so $1 is literal backref

So these are reinterpolated every time you backtrack? Are you *trying* to destroy regex performance? :^)

#     $var            <$var>  assumed to be regex

What if $var is a qr//ed object?

#     =~ $re          =~ /<$re>/      ouch?

I don't see the win.

#     (??{$rule})
#     (?{ code })     { code } with failure semantics
#     (?#...)         {"..."} :-)
#     (?:...)         <:...>
#     (?=...)
#     (?!...)
#     (?<=...)
#     (?

Cute. (Wait a minute, aren't those reversed?)

#     (?>...)
#     (?(cond)t|f)    Not sure. Could just use { if ... }

?

# Obviously the and syntaxes will be user
# extensible. We have to be able to support full grammars. I
# consider it a feature that looks like a non-terminal in
# standard BNF notation. I do not consider it a misfeature
# that resembles an HTML or XML tag, since most of those
# languages need to be matched with a fancy rule named anyway.

But that *does* make it harder to define the fancy rules. I could see someone defining rules like:

    'gt' => qr/\ qr/\>/

just to get around backslashing everything in sight.

# An interesting idea would be that if you say
#
#     m
#
# or
#
#     m{code}
#
# it's as if you said
#
#     m//
#
# or
#
#     m/{code}/

I don't know about that one. I often use {} as delimiters on regexen because it's a character that doesn't occur in data very often. I think the gain of two characters isn't as critical as the loss of options.

Understand, I'm not a regex Luddite. I've been working with yacc and lex a lot lately, so I have at least a hint of how powerful formal parsing is--and I love all of these features. However, I think that syntactically a l