Re: Semantics for regexes - copy/snapshot

2004-09-09 Thread Chip Salzenberg
According to Larry Wall: > I don't claim to follow all this talk about "stores" Think about tied values. When does STORE get called, precisely, on a tied target of s///? It's good to be explicit about this, down at the C API level, just so we know what to optimize for. The final answer is proba

Re: Semantics for regexes - copy/snapshot

2004-09-09 Thread Larry Wall
On Wed, Sep 08, 2004 at 11:00:54PM -0700, Steve Fink wrote: : I vote for leaving all of these sorts of cases undefined. Well, : partially defined -- I'd rather we didn't allow ($a = "aaa") =~ s/a/b/g : to turn $a into "gawrsh". At the very least, define the exact number of : output and stores for "

Re: Semantics for regexes - copy/snapshot

2004-09-08 Thread Steve Fink
On Sep-09, [EMAIL PROTECTED] wrote: > On Wed, 8 Sep 2004, Chip Salzenberg wrote: > > > According to [EMAIL PROTECTED]: > > > So how many stores do we expect for > > >($a = "xxx") =~ s/a/b/g > > > and which of the possible answers would be more useful? > > > > I think it depends on C<($a = "aaa

Re: Semantics for regexes - copy/snapshot

2004-09-08 Thread martin
On Wed, 8 Sep 2004, Chip Salzenberg wrote: > According to [EMAIL PROTECTED]: > > So how many stores do we expect for > >($a = "xxx") =~ s/a/b/g > > and which of the possible answers would be more useful? > > I think it depends on C<($a = "aaa") =~ s/a/b/g>. I would agree with you in general,

Re: Semantics for regexes - copy/snapshot

2004-09-08 Thread Chip Salzenberg
According to [EMAIL PROTECTED]: > So how many stores do we expect for >($a = "xxx") =~ s/a/b/g > and which of the possible answers would be more useful? I think it depends on C<($a = "aaa") =~ s/a/b/g>. * If the s/// operator stores once after all substitutions, then having it always store

Re: Semantics for regexes - copy/snapshot

2004-09-07 Thread martin
On Tue, 7 Sep 2004, Leopold Toetsch wrote: > > [*] Unless it's a _feature_ that given tied $a, > >($a = "aaa") =~ s/a/b/g > > would call STORE four times ("aaa", "baa", "bba", "bbb"). > > I'd expect two stores here. One for the initial setting of the value and > one for the final result

Re: Semantics for regexes - copy/snapshot

2004-09-07 Thread Dan Sugalski
At 10:59 PM -0400 9/6/04, Chip Salzenberg wrote: Just across the hall from m// is s/// ... Considering the semantics of m// and especially s/// at the user level, we'll probably[*] want to take snapshots of dynamic strings (think P5's "FETCH" or overload '""'), and apply all the pattern operations

Re: Semantics for regexes - copy/snapshot

2004-09-07 Thread Leopold Toetsch
Chip Salzenberg <[EMAIL PROTECTED]> wrote: > For Topaz, Scalar's interface included a function that would basically > open the Scalar's hood, giving you a Buffer you could manipulate; then > when you were done working with the Buffer, its modifications (if any) > were propagated back down into the

Re: Semantics for regexes - copy/snapshot

2004-09-07 Thread Chip Salzenberg
Just across the hall from m// is s/// ... Considering the semantics of m// and especially s/// at the user level, we'll probably[*] want to take snapshots of dynamic strings (think P5's "FETCH" or overload '""'), and apply all the pattern operations to that snapshot. *However*, in the usual case

Re: Semantics for regexes

2004-09-04 Thread Chip Salzenberg
According to Dan Sugalski: > At 2:44 PM + 9/3/04, Chip Salzenberg wrote: > >According to [EMAIL PROTECTED] (Dan Sugalski): > >>*) extract substring > > > >Rather than that, wouldn't you prefer to make "substring of target > >string" the actual target of all these? > > Only if the resulting sub

Re: Semantics for regexes

2004-09-04 Thread Patrick R. Michaud
On Fri, Sep 03, 2004 at 02:44:52PM -, Chip Salzenberg wrote: > According to [EMAIL PROTECTED] (Dan Sugalski): > >*) extract substring > > Rather than that, wouldn't you prefer to make "substring of target > string" the actual target of all these? Yes, yes, yes, this would be far more useful.

Re: Semantics for regexes

2004-09-03 Thread Dan Sugalski
At 12:55 PM -0400 9/3/04, Chip Salzenberg wrote: According to Dan Sugalski: At 2:44 PM + 9/3/04, Chip Salzenberg wrote: >According to [EMAIL PROTECTED] (Dan Sugalski): >>*) extract substring > >Rather than that, wouldn't you prefer to make "substring of target >string" the actual target o

Re: Semantics for regexes

2004-09-03 Thread Dan Sugalski
At 2:44 PM + 9/3/04, Chip Salzenberg wrote: According to [EMAIL PROTECTED] (Dan Sugalski): *) extract substring Rather than that, wouldn't you prefer to make "substring of target string" the actual target of all these? Only if the resulting substring'd be used in the match. Otherwise you're be

Re: Semantics for regexes

2004-09-03 Thread Chip Salzenberg
According to [EMAIL PROTECTED] (Dan Sugalski): >*) extract substring Rather than that, wouldn't you prefer to make "substring of target string" the actual target of all these? >*) exact string compare >*) find string in string >*) find first character of class X in string >*) find first character

Re: Semantics for regexes

2004-09-02 Thread Aaron Sherman
Ok, I get it now, thanks Larry. I do still think that you can do what I suggest, but I realize that it's not as easy as handing around a single pad, you would actually need to maintain either a list of pads (outside of the built-in pad stack, probably inside of C<$0>) or a list of C<$0>s, each wi

Re: Semantics for regexes

2004-09-02 Thread Larry Wall
On Thu, Sep 02, 2004 at 10:43:48AM -0400, Aaron Sherman wrote: : On Wed, 2004-09-01 at 17:00, Larry Wall wrote: : > Okay, except that hypotheticality is an attribute of a variable's : > value, not of the pad it's in. : : Yes, I think I got that part, and perhaps I was being unclear or am : still m

Re: Semantics for regexes

2004-09-02 Thread Aaron Sherman
On Thu, 2004-09-02 at 11:27, Felix Gallo wrote: > Although the next regex engine has to deal with the horribly > crufty new perl6 syntax Keep in mind that Perl 6 regexen are really just Perl 5 regexen with a call stack and backtracking control. Absolutely everything else that I see in P6 is eithe

Re: Semantics for regexes

2004-09-02 Thread Dan Sugalski
At 12:19 PM -0400 9/2/04, Felix Gallo wrote: Dan writes: True enough. Oh, don't get me wrong, I think we can go faster than the perl 5 regex engine. I just don't think we can do in 2 seconds what takes perl 5 10 seconds... :-P Yeah, I meant the other way around. I know. :) Lacking any kind of

Re: Semantics for regexes

2004-09-02 Thread Felix Gallo
Dan writes: > True enough. Oh, don't get me wrong, I think we can go faster than > the perl 5 regex engine. I just don't think we can do in 2 seconds > what takes perl 5 10 seconds... :-P Yeah, I meant the other way around. Lacking any kind of formal specification for it, my general thought is

Re: Semantics for regexes

2004-09-02 Thread Dan Sugalski
At 11:27 AM -0400 9/2/04, Felix Gallo wrote: Dan writes: I don't think we're going to be able to manage doing our matches in 20% of the time of the current regex engine. That's a bit ambitious, even for me. :) I dunno, there are a number of extant cases of languages that manage to run regexes ju

Re: Semantics for regexes

2004-09-02 Thread Leopold Toetsch
Aaron Sherman <[EMAIL PROTECTED]> wrote: > A side point to Dan: In reading P6&PE, I don't see an op for deleting an > entry from a pad. $P0 = peek_pad delete $P0["foo"] Deleting by index/depth is unimplemented and marked as TODO in classes/scratchpad.pmc leo

Re: Semantics for regexes

2004-09-02 Thread Felix Gallo
Dan writes: > I don't think we're going to be able to manage doing our matches in > 20% of the time of the current regex engine. That's a bit ambitious, > even for me. :) I dunno, there are a number of extant cases of languages that manage to run regexes just as fast as the current regex engine.

Re: Semantics for regexes

2004-09-02 Thread Aaron Sherman
On Wed, 2004-09-01 at 17:00, Larry Wall wrote: > : Let's get concrete: > : > : rule foo { a $x:=(b*) c } > : "abbabc" > : > : So, if I understand Parrot and Perl 6 correctly (heh, fat chance), a > : slight modification to the calling convention of the closure that > : represents a rule (

Re: Semantics for regexes

2004-09-02 Thread Dan Sugalski
At 9:56 AM -0400 9/2/04, Felix Gallo wrote: Dan writes: [...] Yes, and some of the initial list already has ops to do those bits, though I fully plan on evil cheating versions for some extra speed. If I recall correctly, someone with the best intentions attempted to write a clear, object-oriente

Re: Semantics for regexes

2004-09-02 Thread Felix Gallo
Dan writes: > [...] > Yes, and some of the initial list already has ops to do those bits, > though I fully plan on evil cheating versions for some extra speed. If I recall correctly, someone with the best intentions attempted to write a clear, object-oriented (but still C/C++ based) regex engine

Re: Semantics for regexes

2004-09-02 Thread Dan Sugalski
At 8:24 PM -0700 9/1/04, Steve Fink wrote: On Sep-01, Dan Sugalski wrote: This is a list of the semantics that I see as needed for a regex engine. When we have 'em, we'll map them to string ops, and may well add in some special-case code for faster access. *) extract substring *) exact string

Re: Semantics for regexes

2004-09-01 Thread Steve Fink
On Sep-01, Dan Sugalski wrote: > > This is a list of the semantics that I see as needed for a regex > engine. When we have 'em, we'll map them to string ops, and may well > add in some special-case code for faster access. > > *) extract substring > *) exact string compare > *) find string in st

Re: Semantics for regexes

2004-09-01 Thread Larry Wall
On Wed, Sep 01, 2004 at 04:33:24PM -0400, Aaron Sherman wrote: : On Wed, 2004-09-01 at 16:07, Larry Wall wrote: : : > I see one other potential gotcha with respect to backtracking and : > closures. In P6, a closure can declare a hypothetical variable : > that is restored only if the closure exits

Re: Semantics for regexes

2004-09-01 Thread Patrick R. Michaud
On Wed, Sep 01, 2004 at 01:07:49PM -0700, Larry Wall wrote: > On Wed, Sep 01, 2004 at 01:57:32PM -0400, Dan Sugalski wrote: > : I promised Patrick this a while back but never got it, so here it is. > : > : This is a list of the semantics that I see as needed for a regex > : engine. When we have '

Re: Semantics for regexes

2004-09-01 Thread Aaron Sherman
On Wed, 2004-09-01 at 16:33, Aaron Sherman wrote: > rule foo { a $x:=(b*) c } In the rest of my message I acted as if that read: rule foo { a $x:=(b+) c } so, we may as well pretend that that's what I meant to say ;-) -- 781-324-3772 [EMAIL PROTECTED] http://www.ajs.com/~a

Re: Semantics for regexes

2004-09-01 Thread Aaron Sherman
On Wed, 2004-09-01 at 16:07, Larry Wall wrote: > I see one other potential gotcha with respect to backtracking and > closures. In P6, a closure can declare a hypothetical variable > that is restored only if the closure exits "unsuccessfully". Within > a rule, an embedded closure is unsuccessful

Re: Semantics for regexes

2004-09-01 Thread Larry Wall
On Wed, Sep 01, 2004 at 01:07:49PM -0700, Larry Wall wrote: : We might have to use arbitrary code to match arrays and hashes as well, : if the opcodes support only scalar string matches. I really wasn't being very clear about this. For efficiency we may need "trie" support (or something like it)

Re: Semantics for regexes

2004-09-01 Thread Larry Wall
On Wed, Sep 01, 2004 at 01:57:32PM -0400, Dan Sugalski wrote: : I promised Patrick this a while back but never got it, so here it is. : : This is a list of the semantics that I see as needed for a regex : engine. When we have 'em, we'll map them to string ops, and may well : add in some special-

Semantics for regexes

2004-09-01 Thread Dan Sugalski
I promised Patrick this a while back but never got it, so here it is. This is a list of the semantics that I see as needed for a regex engine. When we have 'em, we'll map them to string ops, and may well add in some special-case code for faster access. *) extract substring *) exact string compar