I've had an idea brewing for a while, and since talk seems to have turned
to reg^H^H^Hpatterns and rules again, I figured this might be the time to
mention it.

A while ago someone asked whether backtracking semantics are
mandatory in any implementation, or whether it would be legal to build an
implementation that, for instance, has no preference among alternatives.
I propose that since the empty pattern is no longer legal (and about
time), we use "|" in patterns to indicate alternation without
preference, and "||" to indicate "try the first, then the second, etc".
So

"cat dog fox" ~~ /( fox | dog | cat )/;

might match any of the three, and in fact is quite likely to match cat
(since cat comes first and scanning the whole string could be a very
long process), but

"cat dog fox" ~~ /( fox || dog || cat )/;

would be guaranteed to first try fox, then if that fails backtrack and
try dog, and if that fails backtrack to try cat.  The choice of symbols
parallels the junction and short-circuiting "or" operators.  In most
cases when an alternation is specified, the programmer has no preference
among the alternatives (indeed, in most cases only one could match
anyway), so it seems silly to force the engine to prefer the first one.
I'm imagining that in the first example, the implementation would
probably build an FSA and process each letter as it comes, while the
second would rely on backtracking.
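
To make the two strategies concrete, here is a rough sketch in Python
(not Perl 6, since the syntax above is only a proposal).  Everything in
it is an assumption for illustration: the helper names are made up, the
alternatives are plain literal strings, and the "||" version assumes the
usual rule that the engine tries each alternative in order at one
starting position before moving on to the next position.  The "|"
version walks a trie one character at a time and reports whichever
alternative completes first, which is only one of the choices an
engine with no preference would be free to make.

```python
ACCEPT = object()  # sentinel key marking "an alternative ends here"


def ordered_match(text, alternatives):
    """Sketch of '||': at each start position, try alternatives in order."""
    for start in range(len(text)):
        for alt in alternatives:
            if text.startswith(alt, start):
                return start, alt
    return None


def build_trie(alternatives):
    """Merge all (non-empty) alternatives into one character trie."""
    trie = {}
    for alt in alternatives:
        node = trie
        for ch in alt:
            node = node.setdefault(ch, {})
        node[ACCEPT] = alt
    return trie


def unordered_match(text, alternatives):
    """Sketch of '|': one left-to-right pass, no preference among alternatives."""
    trie = build_trie(alternatives)
    active = []  # (start position, trie node) for every partial match so far
    for pos, ch in enumerate(text):
        active.append((pos, trie))  # a match may also begin right here
        next_active = []
        for start, node in active:
            child = node.get(ch)
            if child is None:
                continue  # this partial match dies; nothing to backtrack
            if ACCEPT in child:
                return start, child[ACCEPT]
            next_active.append((start, child))
        active = next_active
    return None


print(ordered_match("cat dog fox", ["fox", "dog", "cat"]))
print(unordered_match("cat dog fox", ["fox", "dog", "cat"]))
print(ordered_match("foxes", ["foxes", "fox"]))
print(unordered_match("foxes", ["foxes", "fox"]))
```

Under these assumptions both functions find cat at position 0 in
"cat dog fox".  The difference only shows when alternatives overlap at
the same position: on "foxes", the ordered version returns whichever of
foxes and fox is listed first, while this particular unordered scan
returns fox no matter how the alternatives are ordered.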

What think you all?
-- 
Adam Lopresto
http://cec.wustl.edu/~adam/

There's too much blood in my caffeinestream.
