Re: Does ::: constrain the pattern engine implementation?

Deven T. Corzine Wed, 28 Aug 2002 10:50:31 -0700

On Wed, 28 Aug 2002, Larry Wall wrote:

> : (1) Can we have a ":study" modifier in Perl 6 for patterns?
> : 
> : It could be a no-op if necessary, but it could take the place of Perl 5's 
> : "study" operator and indicate that the programmer WANTS the pattern 
> : optimized for maximum runtime speed, even at the cost of compile time or 
> : memory.  (Hey, how about a ":cram" modifier for extreme optimization? :-)
> 
> Well, "studied" isn't really a property of a pattern--it's a property of a
> string that knows it will have multiple patterns matched against it.  One
> could put a :study on the first pattern, but that's somewhat deceptive.


Oh yeah.  I forgot it applied to the string, not the pattern!  I forgot 
since I never use it! :-)  Still, it could be considered a parallel...

Perhaps a better approach would be to allow the optimization priorities to 
be specified, perhaps even as numerical ranges for relative importance?  
The three obvious dimensions to quantify would be compile time, run-time 
speed, and memory usage.  There's often tradeoffs between these three, and 
allowing the ability for a programmer to specify his/her preferences could 
allow for aggressive optimizations that are normally inappropriate...

Of course, these would be useful not only as modifiers for compiling any 
regexes, but as general pragmas controlling optimizing behavior of the 
entire Perl 6 compiler/optimizer...

I'm not sure if it's good enough to just say "optimize for run-time speed 
at the expense of compile time and memory" (or variations for only having 
one of the two sacrificed) -- or it it's better to have a scale (say, 0-9) 
for how important each dimension is.

For the extreme case where long compile time and large memory usage is 
irrelevant, but extreme run-time speed is a must, the programmer might 
specify optimization priorities of compile=0, memory=0, speed=9.  I'm not 
sure what sort of syntax would be appropriate for such specifications...

> : (2) Would simple alternation impede DFA-style optimizations?
> : 
> : Currently, a pattern like (Jun|June) would never match "June" because the 
> : "leftmost" match "Jun" would always take precedence, despite the normal 
> : "longest-match" behavior of regexes in general.  This example could be 
> : implemented in a DFA; would that always be the case?
> 
> Well, "June" can match if what follows fails to match after "Jun".

True enough.  Couldn't that still be implemented in a DFA?  (Possibly at 
the cost of doubling the size of the DFA for the later part of the regex!)

> : Would it be better for the matching of (Jun|June) to be "undefined" and 
> : implementation-dependent?  Or is it best to require "leftmost" semantics?
> 
> Well, the semantics shouldn't generally wobble around like that, but it'd
> be pretty easy to let them wobble on purpose via pragma (or via :modifier,
> which are really just pragmas that scope to regex groups).

Yeah, it's probably safer not to have that much room for undefined behavior 
since people will just try it and assume that their implementation is the 
universal behavior...

Would there be a good way to say "don't care" about the longest-vs-leftmost 
matching semantics?  Would it be worthwhile to have longest-trumps-leftmost 
as an optional modifier?  (This might be very expensive if implemented in a 
backtracking engine, since it could no longer shortcut alternations...)

Dan suggested ":dfa" for DFA semantics -- is that the best answer, or would 
it be better to define the modifiers in terms of visible behavior rather 
than implementation, if possible?

Deven

Re: Does ::: constrain the pattern engine implementation?

Reply via email to